Overview

Dataset statistics

Number of variables: 98
Number of observations: 584201
Missing cells: 14445452
Missing cells (%): 25.2%
Total size in memory: 436.8 MiB
Average record size in memory: 784.0 B
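
The derived figures in this table follow from the raw counts. The sketch below recomputes them using only numbers shown in the report. The reading of 8 bytes per cell is an assumption that happens to reproduce the reported sizes; the report itself does not state how memory is measured.

```python
# Counts copied from the Dataset statistics table.
n_variables = 98
n_observations = 584_201
missing_cells = 14_445_452

# Share of missing cells over the full variables x observations grid.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
print(f"Missing cells (%): {missing_pct:.1f}%")

# Assumption: each text cell is counted as one 8-byte object reference.
bytes_per_cell = 8
record_bytes = n_variables * bytes_per_cell
total_mib = record_bytes * n_observations / 2**20
print(f"Average record size in memory: {record_bytes:.1f} B")
print(f"Total size in memory: {total_mib:.1f} MiB")
```

Under the same assumption a single column takes 8 × 584201 bytes, roughly 4.5 MiB, which is consistent with the per-variable memory size reported for each variable.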

Variable types

Text: 98

Dataset

Description: Herpetology NMNH Extant Specimen Records 0054921-241126133413365
URL: https://doi.org/10.15468/dl.rf2che

Alerts

license has constant value "CC0_1_0" Constant
publisher has constant value "National Museum of Natural History, Smithsonian Institution" Constant
institutionID has constant value "urn:lsid:biocol.org:col:34871" Constant
collectionID has constant value "urn:uuid:cc104cbf-fd8e-4801-9b71-36731a7db1a0" Constant
institutionCode has constant value "USNM" Constant
collectionCode has constant value "HERP" Constant
datasetName has constant value "NMNH Extant Biology" Constant
occurrenceStatus has constant value "PRESENT" Constant
kingdom has constant value "Animalia" Constant
phylum has constant value "Chordata" Constant
datasetKey has constant value "821cc27a-e3bb-4bc5-ac34-89ada245069d" Constant
publishingCountry has constant value "US" Constant
kingdomKey has constant value "1" Constant
phylumKey has constant value "44" Constant
protocol has constant value "EML" Constant
lastCrawled has constant value "2024-12-02T11:48:23.416Z" Constant
publishedByGbifRegion has constant value "NORTH_AMERICA" Constant
recordNumber has 583925 (> 99.9%) missing values Missing
sex has 531942 (91.1%) missing values Missing
lifeStage has 542754 (92.9%) missing values Missing
associatedSequences has 583480 (99.9%) missing values Missing
occurrenceRemarks has 557618 (95.4%) missing values Missing
fieldNumber has 584193 (> 99.9%) missing values Missing
eventDate has 39140 (6.7%) missing values Missing
startDayOfYear has 86170 (14.8%) missing values Missing
endDayOfYear has 86170 (14.8%) missing values Missing
year has 39600 (6.8%) missing values Missing
month has 59025 (10.1%) missing values Missing
day has 100844 (17.3%) missing values Missing
continent has 10069 (1.7%) missing values Missing
waterBody has 555994 (95.2%) missing values Missing
islandGroup has 564324 (96.6%) missing values Missing
island has 576136 (98.6%) missing values Missing
countryCode has 10837 (1.9%) missing values Missing
stateProvince has 17001 (2.9%) missing values Missing
county has 191557 (32.8%) missing values Missing
verbatimElevation has 331608 (56.8%) missing values Missing
decimalLatitude has 162667 (27.8%) missing values Missing
decimalLongitude has 162667 (27.8%) missing values Missing
coordinateUncertaintyInMeters has 439218 (75.2%) missing values Missing
georeferenceProtocol has 439136 (75.2%) missing values Missing
georeferenceRemarks has 443625 (75.9%) missing values Missing
identificationQualifier has 583784 (99.9%) missing values Missing
typeStatus has 571070 (97.8%) missing values Missing
identifiedBy has 584125 (> 99.9%) missing values Missing
order has 189040 (32.4%) missing values Missing
specificEpithet has 15011 (2.6%) missing values Missing
infraspecificEpithet has 559230 (95.7%) missing values Missing
elevation has 332110 (56.8%) missing values Missing
elevationAccuracy has 333288 (57.1%) missing values Missing
distanceFromCentroidInMeters has 581727 (99.6%) missing values Missing
mediaType has 579082 (99.1%) missing values Missing
orderKey has 189040 (32.4%) missing values Missing
speciesKey has 15011 (2.6%) missing values Missing
species has 15011 (2.6%) missing values Missing
repatriated has 10596 (1.8%) missing values Missing
gbifRegion has 11409 (2.0%) missing values Missing
level0Gid has 173676 (29.7%) missing values Missing
level0Name has 173676 (29.7%) missing values Missing
level1Gid has 174349 (29.8%) missing values Missing
level1Name has 174349 (29.8%) missing values Missing
level2Gid has 186113 (31.9%) missing values Missing
level2Name has 186171 (31.9%) missing values Missing
level3Gid has 532468 (91.1%) missing values Missing
level3Name has 532843 (91.2%) missing values Missing
iucnRedListCategory has 23468 (4.0%) missing values Missing
gbifID has unique values Unique
occurrenceID has unique values Unique
catalogNumber has unique values Unique
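
The three alert kinds in this list (Constant, Missing, Unique) can be illustrated with a small column classifier. This is a hedged sketch, not the profiler's implementation: `classify_column` is a hypothetical helper, `None` stands in for a missing cell, and no reporting thresholds are modelled because the report does not show them.

```python
def classify_column(values):
    """Return alert labels for one column; None marks a missing cell."""
    present = [v for v in values if v is not None]
    n_missing = len(values) - len(present)
    n_distinct = len(set(present))

    alerts = []
    # Constant: every non-missing cell holds the same value.
    if present and n_distinct == 1:
        alerts.append("Constant")
    # Missing: at least one cell is empty.
    if n_missing > 0:
        alerts.append("Missing")
    # Unique: no missing cells and no repeated values.
    if present and n_missing == 0 and n_distinct == len(values):
        alerts.append("Unique")
    return alerts


print(classify_column(["USNM", "USNM", "USNM"]))
print(classify_column(["RWM 20004", None, "RWM 19953"]))
print(classify_column(["1317203362", "1317203927", "1317204107"]))
```

The toy columns reuse sample values from this report: an institutionCode-like column is flagged Constant, a recordNumber-like column Missing, and a gbifID-like column Unique.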

Reproduction

Analysis started: 2025-01-08 22:55:34.710458
Analysis finished: 2025-01-08 22:55:58.312133
Duration: 23.6 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
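
The reported duration is the difference between the two timestamps. A minimal check using only the Python standard library, with the timestamps copied from this section:

```python
from datetime import datetime

# Timestamps as printed in the Reproduction section.
started = datetime.fromisoformat("2025-01-08 22:55:34.710458")
finished = datetime.fromisoformat("2025-01-08 22:55:58.312133")

duration = (finished - started).total_seconds()
print(f"Duration: {duration:.1f} seconds")
```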

Variables

gbifID
Text

Unique 

Distinct: 584201
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 5842010
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
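
As an illustration of these character properties, the general category of each character can be looked up with Python's standard `unicodedata` module. This is a sketch of the idea, not the profiler's own code; `category_counts` is a hypothetical helper. The standard module exposes categories only, so scripts and blocks are not covered here.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count Unicode general categories (e.g. 'Nd', 'Lu') in a string."""
    return Counter(unicodedata.category(ch) for ch in text)


# 'Nd' is Decimal Number, 'Lu' is Uppercase Letter,
# 'Pc' is Connector Punctuation.
gbif_id = category_counts("1317203362")   # a gbifID sample value
license_value = category_counts("CC0_1_0")  # the constant license value
print(gbif_id)
print(license_value)
```

For the license value this gives three decimal numbers, two uppercase letters and two connector punctuation characters out of seven, matching the 42.9% / 28.6% / 28.6% split reported for that variable.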

Unique

Unique: 584201
Unique (%): 100.0%

Sample

1st row: 1317203362
2nd row: 1317203927
3rd row: 1317204107
4th row: 1322537851
5th row: 1322539748

Value | Count | Frequency (%)
1317203362 | 1 | < 0.1%
1322539748 | 1 | < 0.1%
1322560470 | 1 | < 0.1%
1322558547 | 1 | < 0.1%
1317274722 | 1 | < 0.1%
1317214758 | 1 | < 0.1%
1317204107 | 1 | < 0.1%
1322537851 | 1 | < 0.1%
1317211425 | 1 | < 0.1%
1322569185 | 1 | < 0.1%
Other values (584191) | 584191 | > 99.9%

Most occurring characters

ValueCountFrequency (%)
1 1289572
22.1%
3 931906
16.0%
2 745858
12.8%
8 464209
 
7.9%
9 461174
 
7.9%
0 439271
 
7.5%
7 430436
 
7.4%
4 371688
 
6.4%
5 355028
 
6.1%
6 352868
 
6.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5842010
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1289572
22.1%
3 931906
16.0%
2 745858
12.8%
8 464209
 
7.9%
9 461174
 
7.9%
0 439271
 
7.5%
7 430436
 
7.4%
4 371688
 
6.4%
5 355028
 
6.1%
6 352868
 
6.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5842010
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1289572
22.1%
3 931906
16.0%
2 745858
12.8%
8 464209
 
7.9%
9 461174
 
7.9%
0 439271
 
7.5%
7 430436
 
7.4%
4 371688
 
6.4%
5 355028
 
6.1%
6 352868
 
6.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5842010
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1289572
22.1%
3 931906
16.0%
2 745858
12.8%
8 464209
 
7.9%
9 461174
 
7.9%
0 439271
 
7.5%
7 430436
 
7.4%
4 371688
 
6.4%
5 355028
 
6.1%
6 352868
 
6.0%

license
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 4089407
Distinct characters: 4
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CC0_1_0
2nd row: CC0_1_0
3rd row: CC0_1_0
4th row: CC0_1_0
5th row: CC0_1_0

Value | Count | Frequency (%)
cc0_1_0 | 584201 | 100.0%

Most occurring characters

ValueCountFrequency (%)
C 1168402
28.6%
0 1168402
28.6%
_ 1168402
28.6%
1 584201
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1752603
42.9%
Uppercase Letter 1168402
28.6%
Connector Punctuation 1168402
28.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1168402
66.7%
1 584201
33.3%
Uppercase Letter
ValueCountFrequency (%)
C 1168402
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1168402
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2921005
71.4%
Latin 1168402
 
28.6%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1168402
40.0%
_ 1168402
40.0%
1 584201
20.0%
Latin
ValueCountFrequency (%)
C 1168402
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4089407
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 1168402
28.6%
0 1168402
28.6%
_ 1168402
28.6%
1 584201
14.3%

Distinct: 11116
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Characters and Unicode

Total characters: 11684020
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6239
Unique (%): 1.1%

Sample

1st row: 2022-03-25T16:29:00Z
2nd row: 2022-12-14T12:20:00Z
3rd row: 2022-07-25T13:54:00Z
4th row: 2022-03-25T16:12:00Z
5th row: 2022-03-25T16:41:00Z

Value | Count | Frequency (%)
2022-08-17t10:53:00z | 3308 | 0.6%
2022-08-17t10:58:00z | 3292 | 0.6%
2022-08-17t10:59:00z | 3292 | 0.6%
2022-08-17t10:54:00z | 3283 | 0.6%
2022-08-17t10:57:00z | 3269 | 0.6%
2022-08-17t10:56:00z | 3263 | 0.6%
2022-08-17t11:00:00z | 3247 | 0.6%
2022-08-17t11:01:00z | 3245 | 0.6%
2022-08-17t11:03:00z | 3243 | 0.6%
2022-08-17t11:15:00z | 3237 | 0.6%
Other values (11106) | 551522 | 94.4%

Most occurring characters

ValueCountFrequency (%)
0 2825947
24.2%
2 1945269
16.6%
1 1362306
11.7%
- 1168402
10.0%
: 1168402
10.0%
T 584201
 
5.0%
Z 584201
 
5.0%
8 454189
 
3.9%
5 397958
 
3.4%
3 369937
 
3.2%
Other values (4) 823208
 
7.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8178814
70.0%
Dash Punctuation 1168402
 
10.0%
Other Punctuation 1168402
 
10.0%
Uppercase Letter 1168402
 
10.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2825947
34.6%
2 1945269
23.8%
1 1362306
16.7%
8 454189
 
5.6%
5 397958
 
4.9%
3 369937
 
4.5%
7 249238
 
3.0%
4 229171
 
2.8%
6 174007
 
2.1%
9 170792
 
2.1%
Uppercase Letter
ValueCountFrequency (%)
T 584201
50.0%
Z 584201
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 1168402
100.0%
Other Punctuation
ValueCountFrequency (%)
: 1168402
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 10515618
90.0%
Latin 1168402
 
10.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2825947
26.9%
2 1945269
18.5%
1 1362306
13.0%
- 1168402
11.1%
: 1168402
11.1%
8 454189
 
4.3%
5 397958
 
3.8%
3 369937
 
3.5%
7 249238
 
2.4%
4 229171
 
2.2%
Other values (2) 344799
 
3.3%
Latin
ValueCountFrequency (%)
T 584201
50.0%
Z 584201
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11684020
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2825947
24.2%
2 1945269
16.6%
1 1362306
11.7%
- 1168402
10.0%
: 1168402
10.0%
T 584201
 
5.0%
Z 584201
 
5.0%
8 454189
 
3.9%
5 397958
 
3.4%
3 369937
 
3.2%
Other values (4) 823208
 
7.0%

publisher
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 59
Median length: 59
Mean length: 59
Min length: 59

Characters and Unicode

Total characters: 34467859
Distinct characters: 21
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: National Museum of Natural History, Smithsonian Institution
2nd row: National Museum of Natural History, Smithsonian Institution
3rd row: National Museum of Natural History, Smithsonian Institution
4th row: National Museum of Natural History, Smithsonian Institution
5th row: National Museum of Natural History, Smithsonian Institution

Value | Count | Frequency (%)
national | 584201 | 14.3%
museum | 584201 | 14.3%
of | 584201 | 14.3%
natural | 584201 | 14.3%
history | 584201 | 14.3%
smithsonian | 584201 | 14.3%
institution | 584201 | 14.3%
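
The value table for publisher counts lowercased words rather than whole cell values: the comma after "History" is gone and each of the seven words appears once per row. The sketch below reproduces that behaviour on a few rows. The tokenization (lowercase, split on whitespace, strip surrounding punctuation) is an assumption inferred from the table, and `word_counts` is a hypothetical helper, not a profiler function.

```python
from collections import Counter


def word_counts(values):
    """Count lowercased, whitespace-separated words across all values."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            # Assumed rule: drop punctuation attached to a word's edges.
            token = token.strip(",.;:")
            if token:
                counts[token] += 1
    return counts


rows = ["National Museum of Natural History, Smithsonian Institution"] * 3
counts = word_counts(rows)
total = sum(counts.values())
for word, count in counts.items():
    print(f"{word} | {count} | {100 * count / total:.1f}%")
```

With seven equally frequent words, each one accounts for 1/7 of all tokens, which is the 14.3% shown for every row of the table.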

Most occurring characters

ValueCountFrequency (%)
t 4089407
11.9%
i 3505206
10.2%
3505206
10.2%
a 2921005
 
8.5%
o 2921005
 
8.5%
n 2921005
 
8.5%
s 2336804
 
6.8%
u 2336804
 
6.8%
r 1168402
 
3.4%
m 1168402
 
3.4%
Other values (11) 7594613
22.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 26873246
78.0%
Space Separator 3505206
 
10.2%
Uppercase Letter 3505206
 
10.2%
Other Punctuation 584201
 
1.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 4089407
15.2%
i 3505206
13.0%
a 2921005
10.9%
o 2921005
10.9%
n 2921005
10.9%
s 2336804
8.7%
u 2336804
8.7%
r 1168402
 
4.3%
m 1168402
 
4.3%
l 1168402
 
4.3%
Other values (4) 2336804
8.7%
Uppercase Letter
ValueCountFrequency (%)
N 1168402
33.3%
M 584201
16.7%
H 584201
16.7%
S 584201
16.7%
I 584201
16.7%
Space Separator
ValueCountFrequency (%)
3505206
100.0%
Other Punctuation
ValueCountFrequency (%)
, 584201
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 30378452
88.1%
Common 4089407
 
11.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 4089407
13.5%
i 3505206
11.5%
a 2921005
9.6%
o 2921005
9.6%
n 2921005
9.6%
s 2336804
 
7.7%
u 2336804
 
7.7%
r 1168402
 
3.8%
m 1168402
 
3.8%
N 1168402
 
3.8%
Other values (9) 5842010
19.2%
Common
ValueCountFrequency (%)
3505206
85.7%
, 584201
 
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 34467859
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 4089407
11.9%
i 3505206
10.2%
3505206
10.2%
a 2921005
 
8.5%
o 2921005
 
8.5%
n 2921005
 
8.5%
s 2336804
 
6.8%
u 2336804
 
6.8%
r 1168402
 
3.4%
m 1168402
 
3.4%
Other values (11) 7594613
22.0%

institutionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 29
Median length: 29
Mean length: 29
Min length: 29

Characters and Unicode

Total characters: 16941829
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:lsid:biocol.org:col:34871
2nd row: urn:lsid:biocol.org:col:34871
3rd row: urn:lsid:biocol.org:col:34871
4th row: urn:lsid:biocol.org:col:34871
5th row: urn:lsid:biocol.org:col:34871

Value | Count | Frequency (%)
urn:lsid:biocol.org:col:34871 | 584201 | 100.0%

Most occurring characters

ValueCountFrequency (%)
o 2336804
13.8%
: 2336804
13.8%
l 1752603
 
10.3%
i 1168402
 
6.9%
r 1168402
 
6.9%
c 1168402
 
6.9%
g 584201
 
3.4%
7 584201
 
3.4%
8 584201
 
3.4%
4 584201
 
3.4%
Other values (8) 4673608
27.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11099819
65.5%
Other Punctuation 2921005
 
17.2%
Decimal Number 2921005
 
17.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 2336804
21.1%
l 1752603
15.8%
i 1168402
10.5%
r 1168402
10.5%
c 1168402
10.5%
g 584201
 
5.3%
u 584201
 
5.3%
b 584201
 
5.3%
d 584201
 
5.3%
s 584201
 
5.3%
Decimal Number
ValueCountFrequency (%)
7 584201
20.0%
8 584201
20.0%
4 584201
20.0%
3 584201
20.0%
1 584201
20.0%
Other Punctuation
ValueCountFrequency (%)
: 2336804
80.0%
. 584201
 
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11099819
65.5%
Common 5842010
34.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 2336804
21.1%
l 1752603
15.8%
i 1168402
10.5%
r 1168402
10.5%
c 1168402
10.5%
g 584201
 
5.3%
u 584201
 
5.3%
b 584201
 
5.3%
d 584201
 
5.3%
s 584201
 
5.3%
Common
ValueCountFrequency (%)
: 2336804
40.0%
7 584201
 
10.0%
8 584201
 
10.0%
4 584201
 
10.0%
3 584201
 
10.0%
. 584201
 
10.0%
1 584201
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16941829
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 2336804
13.8%
: 2336804
13.8%
l 1752603
 
10.3%
i 1168402
 
6.9%
r 1168402
 
6.9%
c 1168402
 
6.9%
g 584201
 
3.4%
7 584201
 
3.4%
8 584201
 
3.4%
4 584201
 
3.4%
Other values (8) 4673608
27.6%

collectionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 45
Median length: 45
Mean length: 45
Min length: 45

Characters and Unicode

Total characters: 26289045
Distinct characters: 20
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:uuid:cc104cbf-fd8e-4801-9b71-36731a7db1a0
2nd row: urn:uuid:cc104cbf-fd8e-4801-9b71-36731a7db1a0
3rd row: urn:uuid:cc104cbf-fd8e-4801-9b71-36731a7db1a0
4th row: urn:uuid:cc104cbf-fd8e-4801-9b71-36731a7db1a0
5th row: urn:uuid:cc104cbf-fd8e-4801-9b71-36731a7db1a0

Value | Count | Frequency (%)
urn:uuid:cc104cbf-fd8e-4801-9b71-36731a7db1a0 | 584201 | 100.0%

Most occurring characters

ValueCountFrequency (%)
1 2921005
 
11.1%
- 2336804
 
8.9%
u 1752603
 
6.7%
c 1752603
 
6.7%
7 1752603
 
6.7%
0 1752603
 
6.7%
b 1752603
 
6.7%
d 1752603
 
6.7%
4 1168402
 
4.4%
f 1168402
 
4.4%
Other values (10) 8178814
31.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11684020
44.4%
Decimal Number 11099819
42.2%
Dash Punctuation 2336804
 
8.9%
Other Punctuation 1168402
 
4.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 1752603
15.0%
c 1752603
15.0%
b 1752603
15.0%
d 1752603
15.0%
f 1168402
10.0%
a 1168402
10.0%
i 584201
 
5.0%
r 584201
 
5.0%
e 584201
 
5.0%
n 584201
 
5.0%
Decimal Number
ValueCountFrequency (%)
1 2921005
26.3%
7 1752603
15.8%
0 1752603
15.8%
4 1168402
 
10.5%
8 1168402
 
10.5%
3 1168402
 
10.5%
9 584201
 
5.3%
6 584201
 
5.3%
Dash Punctuation
ValueCountFrequency (%)
- 2336804
100.0%
Other Punctuation
ValueCountFrequency (%)
: 1168402
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 14605025
55.6%
Latin 11684020
44.4%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2921005
20.0%
- 2336804
16.0%
7 1752603
12.0%
0 1752603
12.0%
4 1168402
 
8.0%
: 1168402
 
8.0%
8 1168402
 
8.0%
3 1168402
 
8.0%
9 584201
 
4.0%
6 584201
 
4.0%
Latin
ValueCountFrequency (%)
u 1752603
15.0%
c 1752603
15.0%
b 1752603
15.0%
d 1752603
15.0%
f 1168402
10.0%
a 1168402
10.0%
i 584201
 
5.0%
r 584201
 
5.0%
e 584201
 
5.0%
n 584201
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 26289045
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2921005
 
11.1%
- 2336804
 
8.9%
u 1752603
 
6.7%
c 1752603
 
6.7%
7 1752603
 
6.7%
0 1752603
 
6.7%
b 1752603
 
6.7%
d 1752603
 
6.7%
4 1168402
 
4.4%
f 1168402
 
4.4%
Other values (10) 8178814
31.1%

institutionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 2336804
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: USNM
2nd row: USNM
3rd row: USNM
4th row: USNM
5th row: USNM

Value | Count | Frequency (%)
usnm | 584201 | 100.0%

Most occurring characters

ValueCountFrequency (%)
U 584201
25.0%
S 584201
25.0%
N 584201
25.0%
M 584201
25.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2336804
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 584201
25.0%
S 584201
25.0%
N 584201
25.0%
M 584201
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2336804
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 584201
25.0%
S 584201
25.0%
N 584201
25.0%
M 584201
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2336804
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 584201
25.0%
S 584201
25.0%
N 584201
25.0%
M 584201
25.0%

collectionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 2336804
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: HERP
2nd row: HERP
3rd row: HERP
4th row: HERP
5th row: HERP

Value | Count | Frequency (%)
herp | 584201 | 100.0%

Most occurring characters

ValueCountFrequency (%)
H 584201
25.0%
E 584201
25.0%
R 584201
25.0%
P 584201
25.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2336804
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
H 584201
25.0%
E 584201
25.0%
R 584201
25.0%
P 584201
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2336804
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
H 584201
25.0%
E 584201
25.0%
R 584201
25.0%
P 584201
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2336804
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
H 584201
25.0%
E 584201
25.0%
R 584201
25.0%
P 584201
25.0%

datasetName
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 11099819
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NMNH Extant Biology
2nd row: NMNH Extant Biology
3rd row: NMNH Extant Biology
4th row: NMNH Extant Biology
5th row: NMNH Extant Biology

Value | Count | Frequency (%)
nmnh | 584201 | 33.3%
extant | 584201 | 33.3%
biology | 584201 | 33.3%

Most occurring characters

ValueCountFrequency (%)
N 1168402
 
10.5%
1168402
 
10.5%
t 1168402
 
10.5%
o 1168402
 
10.5%
M 584201
 
5.3%
H 584201
 
5.3%
E 584201
 
5.3%
x 584201
 
5.3%
a 584201
 
5.3%
n 584201
 
5.3%
Other values (5) 2921005
26.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6426211
57.9%
Uppercase Letter 3505206
31.6%
Space Separator 1168402
 
10.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 1168402
18.2%
o 1168402
18.2%
x 584201
9.1%
a 584201
9.1%
n 584201
9.1%
i 584201
9.1%
l 584201
9.1%
g 584201
9.1%
y 584201
9.1%
Uppercase Letter
ValueCountFrequency (%)
N 1168402
33.3%
M 584201
16.7%
H 584201
16.7%
E 584201
16.7%
B 584201
16.7%
Space Separator
ValueCountFrequency (%)
1168402
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9931417
89.5%
Common 1168402
 
10.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 1168402
11.8%
t 1168402
11.8%
o 1168402
11.8%
M 584201
 
5.9%
H 584201
 
5.9%
E 584201
 
5.9%
x 584201
 
5.9%
a 584201
 
5.9%
n 584201
 
5.9%
B 584201
 
5.9%
Other values (4) 2336804
23.5%
Common
ValueCountFrequency (%)
1168402
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11099819
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 1168402
 
10.5%
1168402
 
10.5%
t 1168402
 
10.5%
o 1168402
 
10.5%
M 584201
 
5.3%
H 584201
 
5.3%
E 584201
 
5.3%
x 584201
 
5.3%
a 584201
 
5.3%
n 584201
 
5.3%
Other values (5) 2921005
26.3%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 19
Median length: 18
Mean length: 18.00021739
Min length: 18

Characters and Unicode

Total characters: 10515745
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PRESERVED_SPECIMEN
2nd row: PRESERVED_SPECIMEN
3rd row: PRESERVED_SPECIMEN
4th row: PRESERVED_SPECIMEN
5th row: PRESERVED_SPECIMEN

Value | Count | Frequency (%)
preserved_specimen | 584074 | > 99.9%
machine_observation | 127 | < 0.1%

Most occurring characters

ValueCountFrequency (%)
E 2920624
27.8%
R 1168275
11.1%
S 1168275
11.1%
P 1168148
 
11.1%
I 584328
 
5.6%
N 584328
 
5.6%
V 584201
 
5.6%
_ 584201
 
5.6%
C 584201
 
5.6%
M 584201
 
5.6%
Other values (6) 584963
 
5.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 9931544
94.4%
Connector Punctuation 584201
 
5.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 2920624
29.4%
R 1168275
11.8%
S 1168275
11.8%
P 1168148
 
11.8%
I 584328
 
5.9%
N 584328
 
5.9%
V 584201
 
5.9%
C 584201
 
5.9%
M 584201
 
5.9%
D 584074
 
5.9%
Other values (5) 889
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 584201
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9931544
94.4%
Common 584201
 
5.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 2920624
29.4%
R 1168275
11.8%
S 1168275
11.8%
P 1168148
 
11.8%
I 584328
 
5.9%
N 584328
 
5.9%
V 584201
 
5.9%
C 584201
 
5.9%
M 584201
 
5.9%
D 584074
 
5.9%
Other values (5) 889
 
< 0.1%
Common
ValueCountFrequency (%)
_ 584201
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10515745
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 2920624
27.8%
R 1168275
11.1%
S 1168275
11.1%
P 1168148
 
11.1%
I 584328
 
5.6%
N 584328
 
5.6%
V 584201
 
5.6%
_ 584201
 
5.6%
C 584201
 
5.6%
M 584201
 
5.6%
Other values (6) 584963
 
5.6%

occurrenceID
Text

Unique 

Distinct: 584201
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 63
Median length: 63
Mean length: 63
Min length: 63

Characters and Unicode

Total characters: 36804663
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 584201
Unique (%): 100.0%

Sample

1st row: http://n2t.net/ark:/65665/3000ac9b1-ec0b-4be2-939f-464ad355cc84
2nd row: http://n2t.net/ark:/65665/30010adfb-58e1-4e98-8d39-ee055b3463fa
3rd row: http://n2t.net/ark:/65665/30012ab17-d2a1-470c-a774-540bc6cffb00
4th row: http://n2t.net/ark:/65665/3ec02d332-deb7-4b55-ba3d-5a5d6ca577c9
5th row: http://n2t.net/ark:/65665/3ec19a125-2484-4fa3-b6b7-7d87199a6994

Value | Count | Frequency (%)
http://n2t.net/ark:/65665/3000ac9b1-ec0b-4be2-939f-464ad355cc84 | 1 | < 0.1%
http://n2t.net/ark:/65665/3ec19a125-2484-4fa3-b6b7-7d87199a6994 | 1 | < 0.1%
http://n2t.net/ark:/65665/3ed02751f-656c-458c-80fa-90bf891a2063 | 1 | < 0.1%
http://n2t.net/ark:/65665/3eced04ac-39a4-455a-85e7-7cb0b4299f6b | 1 | < 0.1%
http://n2t.net/ark:/65665/303348f04-82b4-456c-be8d-764af3205229 | 1 | < 0.1%
http://n2t.net/ark:/65665/3008b1b21-05b1-4e8d-b34c-1e3a96daecf7 | 1 | < 0.1%
http://n2t.net/ark:/65665/30012ab17-d2a1-470c-a774-540bc6cffb00 | 1 | < 0.1%
http://n2t.net/ark:/65665/3ec02d332-deb7-4b55-ba3d-5a5d6ca577c9 | 1 | < 0.1%
http://n2t.net/ark:/65665/3006575b6-ca0a-42bd-b75d-3241cc3e332d | 1 | < 0.1%
http://n2t.net/ark:/65665/3ed66e63b-4fff-4639-8abf-a635d31dd047 | 1 | < 0.1%
Other values (584191) | 584191 | > 99.9%

Most occurring characters

ValueCountFrequency (%)
/ 2921005
 
7.9%
6 2847614
 
7.7%
- 2336804
 
6.3%
t 2336804
 
6.3%
5 2265995
 
6.2%
a 1826256
 
5.0%
e 1681096
 
4.6%
2 1680524
 
4.6%
3 1680017
 
4.6%
4 1678083
 
4.6%
Other values (16) 15550465
42.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 15919515
43.3%
Lowercase Letter 13874736
37.7%
Other Punctuation 4673608
 
12.7%
Dash Punctuation 2336804
 
6.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 2336804
16.8%
a 1826256
13.2%
e 1681096
12.1%
b 1241364
8.9%
n 1168402
8.4%
c 1094913
7.9%
f 1094889
7.9%
d 1094208
7.9%
k 584201
 
4.2%
r 584201
 
4.2%
Other values (2) 1168402
8.4%
Decimal Number
ValueCountFrequency (%)
6 2847614
17.9%
5 2265995
14.2%
2 1680524
10.6%
3 1680017
10.6%
4 1678083
10.5%
9 1244007
7.8%
8 1240305
7.8%
1 1096638
 
6.9%
7 1094431
 
6.9%
0 1091901
 
6.9%
Other Punctuation
ValueCountFrequency (%)
/ 2921005
62.5%
: 1168402
 
25.0%
. 584201
 
12.5%
Dash Punctuation
ValueCountFrequency (%)
- 2336804
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 22929927
62.3%
Latin 13874736
37.7%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 2921005
12.7%
6 2847614
12.4%
- 2336804
10.2%
5 2265995
9.9%
2 1680524
7.3%
3 1680017
7.3%
4 1678083
7.3%
9 1244007
 
5.4%
8 1240305
 
5.4%
: 1168402
 
5.1%
Other values (4) 3867171
16.9%
Latin
ValueCountFrequency (%)
t 2336804
16.8%
a 1826256
13.2%
e 1681096
12.1%
b 1241364
8.9%
n 1168402
8.4%
c 1094913
7.9%
f 1094889
7.9%
d 1094208
7.9%
k 584201
 
4.2%
r 584201
 
4.2%
Other values (2) 1168402
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 36804663
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 2921005
 
7.9%
6 2847614
 
7.7%
- 2336804
 
6.3%
t 2336804
 
6.3%
5 2265995
 
6.2%
a 1826256
 
5.0%
e 1681096
 
4.6%
2 1680524
 
4.6%
3 1680017
 
4.6%
4 1678083
 
4.6%
Other values (16) 15550465
42.3%

catalogNumber
Text

Unique 

Distinct: 584201
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 21
Median length: 11
Mean length: 10.93256944
Min length: 6

Characters and Unicode

Total characters: 6386818
Distinct characters: 27
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 584201
Unique (%): 100.0%

Sample

1st row: USNM 231889
2nd row: USNM 487703
3rd row: USNM 297347
4th row: USNM 322261
5th row: USNM 319170

Value | Count | Frequency (%)
usnm | 584201 | 49.5%
herp | 5833 | 0.5%
tissue | 5706 | 0.5%
image | 127 | < 0.1%
2847 | 3 | < 0.1%
2877 | 3 | < 0.1%
2872 | 3 | < 0.1%
2940 | 3 | < 0.1%
2715 | 3 | < 0.1%
9 | 3 | < 0.1%
Other values (581072) | 584183 | 49.5%

Most occurring characters

ValueCountFrequency (%)
595867
 
9.3%
U 584201
 
9.1%
N 584201
 
9.1%
M 584201
 
9.1%
S 584201
 
9.1%
4 393545
 
6.2%
2 393142
 
6.2%
3 392798
 
6.2%
1 391284
 
6.1%
5 383581
 
6.0%
Other values (17) 1499797
23.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3395944
53.2%
Uppercase Letter 2348470
36.8%
Space Separator 595867
 
9.3%
Lowercase Letter 46537
 
0.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 393545
11.6%
2 393142
11.6%
3 392798
11.6%
1 391284
11.5%
5 383581
11.3%
6 292686
8.6%
7 291064
8.6%
8 290326
8.5%
9 285200
8.4%
0 282318
8.3%
Lowercase Letter
ValueCountFrequency (%)
e 11666
25.1%
s 11412
24.5%
r 5833
12.5%
p 5833
12.5%
i 5706
12.3%
u 5706
12.3%
m 127
 
0.3%
a 127
 
0.3%
g 127
 
0.3%
Uppercase Letter
ValueCountFrequency (%)
U 584201
24.9%
N 584201
24.9%
M 584201
24.9%
S 584201
24.9%
H 5833
 
0.2%
T 5706
 
0.2%
I 127
 
< 0.1%
Space Separator
ValueCountFrequency (%)
595867
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3991811
62.5%
Latin 2395007
37.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 584201
24.4%
N 584201
24.4%
M 584201
24.4%
S 584201
24.4%
e 11666
 
0.5%
s 11412
 
0.5%
H 5833
 
0.2%
r 5833
 
0.2%
p 5833
 
0.2%
T 5706
 
0.2%
Other values (6) 11920
 
0.5%
Common
ValueCountFrequency (%)
595867
14.9%
4 393545
9.9%
2 393142
9.8%
3 392798
9.8%
1 391284
9.8%
5 383581
9.6%
6 292686
7.3%
7 291064
7.3%
8 290326
7.3%
9 285200
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6386818
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
595867
 
9.3%
U 584201
 
9.1%
N 584201
 
9.1%
M 584201
 
9.1%
S 584201
 
9.1%
4 393545
 
6.2%
2 393142
 
6.2%
3 392798
 
6.2%
1 391284
 
6.1%
5 383581
 
6.0%
Other values (17) 1499797
23.5%
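
The category counts reported above for catalogNumber (Decimal Number, Uppercase Letter, Space Separator, Lowercase Letter) can be approximated with Python's standard unicodedata module. This is a minimal sketch, not necessarily how the report itself was generated. The CATEGORY_NAMES lookup and the category_counts helper are hypothetical names introduced for illustration, and the input is only the five sample rows listed in the Sample block.

```python
import unicodedata
from collections import Counter

# Hypothetical lookup from general-category codes to the labels used in the report.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw two-letter code for categories not in the lookup.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The five catalogNumber sample rows from the report.
sample = ["USNM 231889", "USNM 487703", "USNM 297347", "USNM 322261", "USNM 319170"]
counts = category_counts(sample)
print(counts)
```

The standard library does not expose script or block properties, so those two breakdowns are not reproduced here.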

recordNumber
Text

Missing 

Distinct: 273
Distinct (%): 98.9%
Missing: 583925
Missing (%): > 99.9%
Memory size: 4.5 MiB

Length

Max length: 9
Median length: 9
Mean length: 8.460144928
Min length: 1

Characters and Unicode

Total characters: 2335
Distinct characters: 19
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 271
Unique (%): 98.2%

Sample

1st row: RWM 20004
2nd row: RWM 19953
3rd row: RWM 19978
4th row: RWM 19932
5th row: RWM 19955
ValueCountFrequency (%)
rwm 182
33.2%
gmu 74
 
13.5%
lc 15
 
2.7%
8 3
 
0.5%
19897 2
 
0.4%
19895 1
 
0.2%
19926 1
 
0.2%
2430 1
 
0.2%
19973 1
 
0.2%
19925 1
 
0.2%
Other values (267) 267
48.7%

Most occurring characters

ValueCountFrequency (%)
272
11.6%
9 260
11.1%
M 257
11.0%
0 245
10.5%
1 190
8.1%
W 182
7.8%
R 182
7.8%
2 165
7.1%
3 95
 
4.1%
G 75
 
3.2%
Other values (9) 412
17.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1262
54.0%
Uppercase Letter 801
34.3%
Space Separator 272
 
11.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 260
20.6%
0 245
19.4%
1 190
15.1%
2 165
13.1%
3 95
 
7.5%
7 71
 
5.6%
6 63
 
5.0%
4 62
 
4.9%
8 57
 
4.5%
5 54
 
4.3%
Uppercase Letter
ValueCountFrequency (%)
M 257
32.1%
W 182
22.7%
R 182
22.7%
G 75
 
9.4%
U 74
 
9.2%
C 15
 
1.9%
L 15
 
1.9%
D 1
 
0.1%
Space Separator
ValueCountFrequency (%)
272
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1534
65.7%
Latin 801
34.3%

Most frequent character per script

Common
ValueCountFrequency (%)
272
17.7%
9 260
16.9%
0 245
16.0%
1 190
12.4%
2 165
10.8%
3 95
 
6.2%
7 71
 
4.6%
6 63
 
4.1%
4 62
 
4.0%
8 57
 
3.7%
Latin
ValueCountFrequency (%)
M 257
32.1%
W 182
22.7%
R 182
22.7%
G 75
 
9.4%
U 74
 
9.2%
C 15
 
1.9%
L 15
 
1.9%
D 1
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2335
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
272
11.6%
9 260
11.1%
M 257
11.0%
0 245
10.5%
1 190
8.1%
W 182
7.8%
R 182
7.8%
2 165
7.1%
3 95
 
4.1%
G 75
 
3.2%
Other values (9) 412
17.6%
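
The recordNumber figures above appear consistent with Distinct (%) and Unique (%) being taken over the non-missing values rather than over all observations, and with Mean length being total characters divided by the non-missing count. The sketch below only rechecks that arithmetic against the reported numbers. The function name summarize is hypothetical, and the interpretation is an inference from the figures, not a statement about the profiling tool's internals.

```python
def summarize(n_rows, missing, distinct, unique, total_chars):
    """Recompute percentage and mean-length figures from raw counts."""
    non_missing = n_rows - missing
    return {
        "non_missing": non_missing,
        "missing_pct": 100 * missing / n_rows,
        "distinct_pct": 100 * distinct / non_missing,
        "unique_pct": 100 * unique / non_missing,
        "mean_length": total_chars / non_missing,
    }


# Counts copied from the overview and the recordNumber section of this report.
stats = summarize(n_rows=584201, missing=583925, distinct=273, unique=271, total_chars=2335)
for key, value in stats.items():
    print(key, round(value, 9))
```

With 276 non-missing values, 273 distinct values round to 98.9%, 271 unique values round to 98.2%, and 2335 characters give the reported mean length.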

Distinct: 158
Distinct (%): < 0.1%
Missing: 4
Missing (%): < 0.1%
Memory size: 4.5 MiB

Length

Max length: 3
Median length: 1
Mean length: 1.004863086
Min length: 1

Characters and Unicode

Total characters: 587038
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 51
Unique (%): < 0.1%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1
ValueCountFrequency (%)
1 576101
98.6%
2 1312
 
0.2%
0 1007
 
0.2%
3 830
 
0.1%
5 523
 
0.1%
4 522
 
0.1%
6 386
 
0.1%
7 339
 
0.1%
8 271
 
< 0.1%
10 257
 
< 0.1%
Other values (148) 2649
 
0.5%

Most occurring characters

ValueCountFrequency (%)
1 577649
98.4%
2 2199
 
0.4%
0 2065
 
0.4%
3 1313
 
0.2%
5 1043
 
0.2%
4 852
 
0.1%
6 611
 
0.1%
7 518
 
0.1%
8 428
 
0.1%
9 360
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 587038
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 577649
98.4%
2 2199
 
0.4%
0 2065
 
0.4%
3 1313
 
0.2%
5 1043
 
0.2%
4 852
 
0.1%
6 611
 
0.1%
7 518
 
0.1%
8 428
 
0.1%
9 360
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 587038
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 577649
98.4%
2 2199
 
0.4%
0 2065
 
0.4%
3 1313
 
0.2%
5 1043
 
0.2%
4 852
 
0.1%
6 611
 
0.1%
7 518
 
0.1%
8 428
 
0.1%
9 360
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 587038
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 577649
98.4%
2 2199
 
0.4%
0 2065
 
0.4%
3 1313
 
0.2%
5 1043
 
0.2%
4 852
 
0.1%
6 611
 
0.1%
7 518
 
0.1%
8 428
 
0.1%
9 360
 
0.1%

sex
Text

Missing 

Distinct: 3
Distinct (%): < 0.1%
Missing: 531942
Missing (%): 91.1%
Memory size: 4.5 MiB

Length

Max length: 13
Median length: 4
Mean length: 4.859507453
Min length: 4

Characters and Unicode

Total characters: 253953
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: MALE
2nd row: MALE
3rd row: FEMALE
4th row: MALE
5th row: FEMALE
ValueCountFrequency (%)
male 29804
57.0%
female 22454
43.0%
hermaphrodite 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
E 74714
29.4%
M 52259
20.6%
A 52259
20.6%
L 52258
20.6%
F 22454
 
8.8%
H 2
 
< 0.1%
R 2
 
< 0.1%
P 1
 
< 0.1%
O 1
 
< 0.1%
D 1
 
< 0.1%
Other values (2) 2
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 253953
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 74714
29.4%
M 52259
20.6%
A 52259
20.6%
L 52258
20.6%
F 22454
 
8.8%
H 2
 
< 0.1%
R 2
 
< 0.1%
P 1
 
< 0.1%
O 1
 
< 0.1%
D 1
 
< 0.1%
Other values (2) 2
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 253953
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 74714
29.4%
M 52259
20.6%
A 52259
20.6%
L 52258
20.6%
F 22454
 
8.8%
H 2
 
< 0.1%
R 2
 
< 0.1%
P 1
 
< 0.1%
O 1
 
< 0.1%
D 1
 
< 0.1%
Other values (2) 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 253953
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 74714
29.4%
M 52259
20.6%
A 52259
20.6%
L 52258
20.6%
F 22454
 
8.8%
H 2
 
< 0.1%
R 2
 
< 0.1%
P 1
 
< 0.1%
O 1
 
< 0.1%
D 1
 
< 0.1%
Other values (2) 2
 
< 0.1%

lifeStage
Text

Missing 

Distinct: 12
Distinct (%): < 0.1%
Missing: 542754
Missing (%): 92.9%
Memory size: 4.5 MiB

Length

Max length: 9
Median length: 8
Mean length: 6.744082805
Min length: 3

Characters and Unicode

Total characters: 279522
Distinct characters: 30
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Larva
2nd row: Egg
3rd row: Larva
4th row: Juvenile
5th row: Juvenile
ValueCountFrequency (%)
juvenile 20321
49.0%
larva 11464
27.7%
adult 3710
 
9.0%
hatchling 2380
 
5.7%
embryo 1048
 
2.5%
egg 838
 
2.0%
neonate 656
 
1.6%
subadult 528
 
1.3%
eft 387
 
0.9%
immature 88
 
0.2%
Other values (2) 27
 
0.1%

Most occurring characters

ValueCountFrequency (%)
e 42069
15.1%
v 31785
11.4%
l 26962
9.6%
a 26603
9.5%
u 25179
9.0%
n 23357
8.4%
i 22701
8.1%
J 20321
7.3%
r 12600
 
4.5%
L 11464
 
4.1%
Other values (20) 36481
13.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 238075
85.2%
Uppercase Letter 41447
 
14.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 42069
17.7%
v 31785
13.4%
l 26962
11.3%
a 26603
11.2%
u 25179
10.6%
n 23357
9.8%
i 22701
9.5%
r 12600
 
5.3%
t 7753
 
3.3%
d 4261
 
1.8%
Other values (10) 14805
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
J 20321
49.0%
L 11464
27.7%
A 3710
 
9.0%
H 2380
 
5.7%
E 2273
 
5.5%
N 656
 
1.6%
S 528
 
1.3%
I 88
 
0.2%
T 23
 
0.1%
F 4
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 279522
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 42069
15.1%
v 31785
11.4%
l 26962
9.6%
a 26603
9.5%
u 25179
9.0%
n 23357
8.4%
i 22701
8.1%
J 20321
7.3%
r 12600
 
4.5%
L 11464
 
4.1%
Other values (20) 36481
13.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 279522
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 42069
15.1%
v 31785
11.4%
l 26962
9.6%
a 26603
9.5%
u 25179
9.0%
n 23357
8.4%
i 22701
8.1%
J 20321
7.3%
r 12600
 
4.5%
L 11464
 
4.1%
Other values (20) 36481
13.1%

occurrenceStatus
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 4089407
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PRESENT
2nd row: PRESENT
3rd row: PRESENT
4th row: PRESENT
5th row: PRESENT
ValueCountFrequency (%)
present 584201
100.0%

Most occurring characters

ValueCountFrequency (%)
E 1168402
28.6%
P 584201
14.3%
R 584201
14.3%
S 584201
14.3%
N 584201
14.3%
T 584201
14.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 4089407
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 1168402
28.6%
P 584201
14.3%
R 584201
14.3%
S 584201
14.3%
N 584201
14.3%
T 584201
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 4089407
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 1168402
28.6%
P 584201
14.3%
R 584201
14.3%
S 584201
14.3%
N 584201
14.3%
T 584201
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4089407
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 1168402
28.6%
P 584201
14.3%
R 584201
14.3%
S 584201
14.3%
N 584201
14.3%
T 584201
14.3%

Distinct: 31
Distinct (%): < 0.1%
Missing: 5684
Missing (%): 1.0%
Memory size: 4.5 MiB

Length

Max length: 53
Median length: 7
Mean length: 7.117061383
Min length: 3

Characters and Unicode

Total characters: 4117341
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): < 0.1%

Sample

1st row: Ethanol
2nd row: Ethanol; Histological Material
3rd row: Ethanol; Dry
4th row: Ethanol
5th row: Ethanol
ValueCountFrequency (%)
ethanol 553871
93.4%
dry 13058
 
2.2%
formalin 8143
 
1.4%
cleared 4474
 
0.8%
and 4474
 
0.8%
stained 4474
 
0.8%
histological 2058
 
0.3%
material 2058
 
0.3%
photograph 126
 
< 0.1%
sem 3
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
a 581736
14.1%
l 572662
13.9%
n 570962
13.9%
o 566382
13.8%
t 562587
13.7%
h 554123
13.5%
E 553874
13.5%
r 27859
 
0.7%
i 18791
 
0.5%
e 15480
 
0.4%
Other values (16) 92885
 
2.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3511631
85.3%
Uppercase Letter 588271
 
14.3%
Space Separator 14223
 
0.3%
Other Punctuation 3216
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 581736
16.6%
l 572662
16.3%
n 570962
16.3%
o 566382
16.1%
t 562587
16.0%
h 554123
15.8%
r 27859
 
0.8%
i 18791
 
0.5%
e 15480
 
0.4%
d 13422
 
0.4%
Other values (6) 27627
 
0.8%
Uppercase Letter
ValueCountFrequency (%)
E 553874
94.2%
D 13058
 
2.2%
F 8143
 
1.4%
S 4477
 
0.8%
C 4474
 
0.8%
M 2061
 
0.4%
H 2058
 
0.3%
P 126
 
< 0.1%
Space Separator
ValueCountFrequency (%)
14223
100.0%
Other Punctuation
ValueCountFrequency (%)
; 3216
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4099902
99.6%
Common 17439
 
0.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 581736
14.2%
l 572662
14.0%
n 570962
13.9%
o 566382
13.8%
t 562587
13.7%
h 554123
13.5%
E 553874
13.5%
r 27859
 
0.7%
i 18791
 
0.5%
e 15480
 
0.4%
Other values (14) 75446
 
1.8%
Common
ValueCountFrequency (%)
14223
81.6%
; 3216
 
18.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4117341
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 581736
14.1%
l 572662
13.9%
n 570962
13.9%
o 566382
13.8%
t 562587
13.7%
h 554123
13.5%
E 553874
13.5%
r 27859
 
0.7%
i 18791
 
0.5%
e 15480
 
0.4%
Other values (16) 92885
 
2.3%

associatedSequences
Text

Missing 

Distinct: 719
Distinct (%): 99.7%
Missing: 583480
Missing (%): 99.9%
Memory size: 4.5 MiB

Length

Max length: 699
Median length: 99
Mean length: 112.1983356
Min length: 49

Characters and Unicode

Total characters: 80895
Distinct characters: 55
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 717
Unique (%): 99.4%

Sample

1st row: https://www.ncbi.nlm.nih.gov/gquery?term=AF199141;https://www.ncbi.nlm.nih.gov/gquery?term=AF199204
2nd row: https://www.ncbi.nlm.nih.gov/gquery?term=OM928184;https://www.ncbi.nlm.nih.gov/gquery?term=OM943246
3rd row: https://www.ncbi.nlm.nih.gov/gquery?term=JQ914700
4th row: https://www.ncbi.nlm.nih.gov/gquery?term=FJ613461
5th row: https://www.ncbi.nlm.nih.gov/gquery?term=FJ766602;https://www.ncbi.nlm.nih.gov/gquery?term=FJ784443
ValueCountFrequency (%)
https://www.ncbi.nlm.nih.gov/gquery?term=jn112709;https://www.ncbi.nlm.nih.gov/gquery?term=jn112771;https://www.ncbi.nlm.nih.gov/gquery?term=jn112642 2
 
0.3%
https://www.ncbi.nlm.nih.gov/gquery?term=ay604497 2
 
0.3%
https://www.ncbi.nlm.nih.gov/gquery?term=fj976636 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=jn377389;https://www.ncbi.nlm.nih.gov/gquery?term=jn377393;https://www.ncbi.nlm.nih.gov/gquery?term=jn377405 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=kc129216;https://www.ncbi.nlm.nih.gov/gquery?term=kc129324 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=ay604512 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=fj766829;https://www.ncbi.nlm.nih.gov/gquery?term=fj784465 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=om928184;https://www.ncbi.nlm.nih.gov/gquery?term=om943246 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=jq914700 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=fj613461 1
 
0.1%
Other values (709) 709
98.3%

Most occurring characters

ValueCountFrequency (%)
. 6533
 
8.1%
t 4896
 
6.1%
/ 4896
 
6.1%
w 4896
 
6.1%
n 4896
 
6.1%
h 3264
 
4.0%
r 3264
 
4.0%
i 3264
 
4.0%
e 3264
 
4.0%
m 3264
 
4.0%
Other values (45) 38458
47.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 50592
62.5%
Other Punctuation 15604
 
19.3%
Decimal Number 9801
 
12.1%
Uppercase Letter 3266
 
4.0%
Math Symbol 1632
 
2.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
J 653
20.0%
F 619
19.0%
M 450
13.8%
K 434
13.3%
A 177
 
5.4%
Y 150
 
4.6%
Q 129
 
3.9%
H 104
 
3.2%
N 86
 
2.6%
O 76
 
2.3%
Other values (10) 388
11.9%
Lowercase Letter
ValueCountFrequency (%)
t 4896
 
9.7%
w 4896
 
9.7%
n 4896
 
9.7%
h 3264
 
6.5%
r 3264
 
6.5%
i 3264
 
6.5%
e 3264
 
6.5%
m 3264
 
6.5%
g 3264
 
6.5%
q 1632
 
3.2%
Other values (9) 14688
29.0%
Decimal Number
ValueCountFrequency (%)
4 1506
15.4%
6 1267
12.9%
7 1220
12.4%
8 1087
11.1%
3 960
9.8%
1 840
8.6%
5 786
8.0%
2 763
7.8%
9 763
7.8%
0 609
6.2%
Other Punctuation
ValueCountFrequency (%)
. 6533
41.9%
/ 4896
31.4%
? 1632
 
10.5%
: 1632
 
10.5%
; 911
 
5.8%
Math Symbol
ValueCountFrequency (%)
= 1632
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 53858
66.6%
Common 27037
33.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 4896
 
9.1%
w 4896
 
9.1%
n 4896
 
9.1%
h 3264
 
6.1%
r 3264
 
6.1%
i 3264
 
6.1%
e 3264
 
6.1%
m 3264
 
6.1%
g 3264
 
6.1%
q 1632
 
3.0%
Other values (29) 17954
33.3%
Common
ValueCountFrequency (%)
. 6533
24.2%
/ 4896
18.1%
= 1632
 
6.0%
? 1632
 
6.0%
: 1632
 
6.0%
4 1506
 
5.6%
6 1267
 
4.7%
7 1220
 
4.5%
8 1087
 
4.0%
3 960
 
3.6%
Other values (6) 4672
17.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 80895
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 6533
 
8.1%
t 4896
 
6.1%
/ 4896
 
6.1%
w 4896
 
6.1%
n 4896
 
6.1%
h 3264
 
4.0%
r 3264
 
4.0%
i 3264
 
4.0%
e 3264
 
4.0%
m 3264
 
4.0%
Other values (45) 38458
47.5%
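
The associatedSequences samples above pack one or more NCBI query URLs into a single string, separated by semicolons, with the accession carried in the term query parameter. A minimal sketch of splitting such a value with the standard urllib.parse module follows. The helper name accessions is hypothetical, and the sketch assumes every URL follows the pattern seen in the sample rows.

```python
from urllib.parse import urlparse, parse_qs


def accessions(value):
    """Extract the `term` query parameter from each semicolon-separated URL."""
    result = []
    for url in value.split(";"):
        query = parse_qs(urlparse(url.strip()).query)
        # Pieces without a `term` parameter contribute nothing.
        result.extend(query.get("term", []))
    return result


# First associatedSequences sample row from the report.
sample = (
    "https://www.ncbi.nlm.nih.gov/gquery?term=AF199141;"
    "https://www.ncbi.nlm.nih.gov/gquery?term=AF199204"
)
print(accessions(sample))
```

The character counts above fit this reading: 721 non-missing values plus 911 semicolons give 1632 URLs, which matches the 1632 occurrences each of "=", "?" and ":".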

occurrenceRemarks
Text

Missing 

Distinct: 5339
Distinct (%): 20.1%
Missing: 557618
Missing (%): 95.4%
Memory size: 4.5 MiB

Length

Max length: 1294
Median length: 381
Mean length: 66.70947598
Min length: 3

Characters and Unicode

Total characters: 1773338
Distinct characters: 91
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3351
Unique (%): 12.6%

Sample

1st row: Collected from vegetation removal plot (Cocolob 2) in coastal strand Cocolobo uvifera forest, ca. 10 m inland from beach.
2nd row: Collected in roadside ditch in gum/bay swamp. Water depth: 10-40 cm.
3rd row: Complete clutch of eggs removed from the ovaries of a female (Total Length: 57 inches) collected along wooded road.
4th row: Collected on surface at night.
5th row: Collected above and below the falls, south of the creek.
ValueCountFrequency (%)
collected 21028
 
7.1%
in 15429
 
5.2%
of 11658
 
3.9%
the 11088
 
3.7%
on 10611
 
3.6%
from 7596
 
2.6%
and 5597
 
1.9%
at 5284
 
1.8%
area 4127
 
1.4%
road 4049
 
1.4%
Other values (6088) 200792
67.5%

Most occurring characters

ValueCountFrequency (%)
270676
15.3%
e 160711
 
9.1%
o 140496
 
7.9%
a 114427
 
6.5%
t 108900
 
6.1%
l 98347
 
5.5%
n 89158
 
5.0%
r 81415
 
4.6%
d 76949
 
4.3%
i 72320
 
4.1%
Other values (81) 559939
31.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1313680
74.1%
Space Separator 270676
 
15.3%
Uppercase Letter 64926
 
3.7%
Decimal Number 55444
 
3.1%
Other Punctuation 51683
 
2.9%
Open Punctuation 5630
 
0.3%
Close Punctuation 5620
 
0.3%
Dash Punctuation 5566
 
0.3%
Math Symbol 113
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 160711
12.2%
o 140496
10.7%
a 114427
 
8.7%
t 108900
 
8.3%
l 98347
 
7.5%
n 89158
 
6.8%
r 81415
 
6.2%
d 76949
 
5.9%
i 72320
 
5.5%
s 64649
 
4.9%
Other values (23) 306308
23.3%
Uppercase Letter
ValueCountFrequency (%)
C 24437
37.6%
P 4884
 
7.5%
N 3946
 
6.1%
A 3756
 
5.8%
S 3750
 
5.8%
T 2957
 
4.6%
R 2957
 
4.6%
M 2263
 
3.5%
F 1854
 
2.9%
H 1826
 
2.8%
Other values (16) 12296
18.9%
Other Punctuation
ValueCountFrequency (%)
. 35953
69.6%
, 7909
 
15.3%
: 3238
 
6.3%
" 1687
 
3.3%
; 1332
 
2.6%
' 566
 
1.1%
/ 486
 
0.9%
% 225
 
0.4%
# 199
 
0.4%
? 57
 
0.1%
Other values (2) 31
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 12010
21.7%
0 9775
17.6%
2 7516
13.6%
9 5201
9.4%
8 4032
 
7.3%
5 3818
 
6.9%
3 3764
 
6.8%
7 3390
 
6.1%
6 3356
 
6.1%
4 2582
 
4.7%
Math Symbol
ValueCountFrequency (%)
= 100
88.5%
+ 7
 
6.2%
< 4
 
3.5%
> 2
 
1.8%
Open Punctuation
ValueCountFrequency (%)
( 5543
98.5%
[ 87
 
1.5%
Close Punctuation
ValueCountFrequency (%)
) 5533
98.5%
] 87
 
1.5%
Space Separator
ValueCountFrequency (%)
270676
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5566
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1378606
77.7%
Common 394732
 
22.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 160711
11.7%
o 140496
 
10.2%
a 114427
 
8.3%
t 108900
 
7.9%
l 98347
 
7.1%
n 89158
 
6.5%
r 81415
 
5.9%
d 76949
 
5.6%
i 72320
 
5.2%
s 64649
 
4.7%
Other values (49) 371234
26.9%
Common
ValueCountFrequency (%)
270676
68.6%
. 35953
 
9.1%
1 12010
 
3.0%
0 9775
 
2.5%
, 7909
 
2.0%
2 7516
 
1.9%
- 5566
 
1.4%
( 5543
 
1.4%
) 5533
 
1.4%
9 5201
 
1.3%
Other values (22) 29050
 
7.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1773305
> 99.9%
None 33
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
270676
15.3%
e 160711
 
9.1%
o 140496
 
7.9%
a 114427
 
6.5%
t 108900
 
6.1%
l 98347
 
5.5%
n 89158
 
5.0%
r 81415
 
4.6%
d 76949
 
4.3%
i 72320
 
4.1%
Other values (74) 559906
31.6%
None
ValueCountFrequency (%)
ö 14
42.4%
á 7
21.2%
é 5
 
15.2%
ó 2
 
6.1%
ü 2
 
6.1%
è 2
 
6.1%
ñ 1
 
3.0%

fieldNumber
Text

Missing 

Distinct: 2
Distinct (%): 25.0%
Missing: 584193
Missing (%): > 99.9%
Memory size: 4.5 MiB

Length

Max length: 7
Median length: 6
Mean length: 6.125
Min length: 6

Characters and Unicode

Total characters: 49
Distinct characters: 8
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 12.5%

Sample

1st row: 83-012
2nd row: 83-012
3rd row: 83-012
4th row: 83-012
5th row: 83-012
ValueCountFrequency (%)
83-012 7
87.5%
83-024a 1
 
12.5%

Most occurring characters

ValueCountFrequency (%)
8 8
16.3%
3 8
16.3%
- 8
16.3%
0 8
16.3%
2 8
16.3%
1 7
14.3%
4 1
 
2.0%
A 1
 
2.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 40
81.6%
Dash Punctuation 8
 
16.3%
Uppercase Letter 1
 
2.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
8 8
20.0%
3 8
20.0%
0 8
20.0%
2 8
20.0%
1 7
17.5%
4 1
 
2.5%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 48
98.0%
Latin 1
 
2.0%

Most frequent character per script

Common
ValueCountFrequency (%)
8 8
16.7%
3 8
16.7%
- 8
16.7%
0 8
16.7%
2 8
16.7%
1 7
14.6%
4 1
 
2.1%
Latin
ValueCountFrequency (%)
A 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 49
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8 8
16.3%
3 8
16.3%
- 8
16.3%
0 8
16.3%
2 8
16.3%
1 7
14.3%
4 1
 
2.0%
A 1
 
2.0%

eventDate
Text

Missing 

Distinct: 31039
Distinct (%): 5.7%
Missing: 39140
Missing (%): 6.7%
Memory size: 4.5 MiB

Length

Max length: 21
Median length: 10
Mean length: 9.940764428
Min length: 4

Characters and Unicode

Total characters: 5418323
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7117
Unique (%): 1.3%

Sample

1st row: 1972-02-01/1972-02-03
2nd row: 1971-09-03
3rd row: 1992-10-15
4th row: 1992-06-24
5th row: 1998-09-03
ValueCountFrequency (%)
1883 739
 
0.1%
1973-09-22 723
 
0.1%
1935 701
 
0.1%
1998-10-09 690
 
0.1%
1971-08-16 610
 
0.1%
1940 598
 
0.1%
1966-04-11 579
 
0.1%
1970-06-19 564
 
0.1%
1976-10-03 540
 
0.1%
1971-07-31 521
 
0.1%
Other values (31029) 538796
98.9%

Most occurring characters

ValueCountFrequency (%)
- 1054903
19.5%
1 987143
18.2%
0 809873
14.9%
9 731463
13.5%
2 355880
 
6.6%
7 294421
 
5.4%
6 287515
 
5.3%
8 286448
 
5.3%
3 210910
 
3.9%
5 208423
 
3.8%
Other values (2) 191344
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4348746
80.3%
Dash Punctuation 1054903
 
19.5%
Other Punctuation 14674
 
0.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 987143
22.7%
0 809873
18.6%
9 731463
16.8%
2 355880
 
8.2%
7 294421
 
6.8%
6 287515
 
6.6%
8 286448
 
6.6%
3 210910
 
4.8%
5 208423
 
4.8%
4 176670
 
4.1%
Dash Punctuation
ValueCountFrequency (%)
- 1054903
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 14674
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5418323
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 1054903
19.5%
1 987143
18.2%
0 809873
14.9%
9 731463
13.5%
2 355880
 
6.6%
7 294421
 
5.4%
6 287515
 
5.3%
8 286448
 
5.3%
3 210910
 
3.9%
5 208423
 
3.8%
Other values (2) 191344
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5418323
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 1054903
19.5%
1 987143
18.2%
0 809873
14.9%
9 731463
13.5%
2 355880
 
6.6%
7 294421
 
5.4%
6 287515
 
5.3%
8 286448
 
5.3%
3 210910
 
3.9%
5 208423
 
3.8%
Other values (2) 191344
 
3.5%
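
The eventDate values above mix bare years (length 4, such as 1883), full dates (length 10) and date ranges joined by "/" (length 21), so they are profiled as text and not as dates. A minimal sketch of normalising such strings into a start and end date follows. The helper name parse_event_date is hypothetical, and the year-month form it also accepts is an assumption, since no such value appears in the samples shown.

```python
import calendar
from datetime import date


def parse_event_date(value):
    """Return the (start, end) dates covered by an eventDate string."""

    def bounds(part):
        pieces = [int(p) for p in part.split("-")]
        if len(pieces) == 1:  # year only, e.g. "1883"
            year = pieces[0]
            return date(year, 1, 1), date(year, 12, 31)
        if len(pieces) == 2:  # year-month (assumed form)
            year, month = pieces
            last_day = calendar.monthrange(year, month)[1]
            return date(year, month, 1), date(year, month, last_day)
        year, month, day = pieces  # full date
        return date(year, month, day), date(year, month, day)

    if "/" in value:  # range: earliest start to latest end
        first, second = value.split("/", 1)
        return bounds(first)[0], bounds(second)[1]
    return bounds(value)


start, end = parse_event_date("1972-02-01/1972-02-03")
# Day-of-year values for the first sample row.
print(start.timetuple().tm_yday, end.timetuple().tm_yday)
```

For the first sample row this gives day 32 and day 34, which agrees with the first sample rows of the startDayOfYear and endDayOfYear variables in this report.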

startDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): 0.1%
Missing: 86170
Missing (%): 14.8%
Memory size: 4.5 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.783003468
Min length: 1

Characters and Unicode

Total characters: 1386022
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 32
2nd row: 246
3rd row: 289
4th row: 176
5th row: 246
ValueCountFrequency (%)
227 2917
 
0.6%
230 2852
 
0.6%
233 2687
 
0.5%
196 2660
 
0.5%
210 2604
 
0.5%
232 2592
 
0.5%
145 2504
 
0.5%
106 2489
 
0.5%
228 2467
 
0.5%
209 2408
 
0.5%
Other values (356) 471851
94.7%

Most occurring characters

ValueCountFrequency (%)
1 290344
20.9%
2 266757
19.2%
3 147664
10.7%
0 99736
 
7.2%
4 99682
 
7.2%
8 97615
 
7.0%
6 97257
 
7.0%
9 96182
 
6.9%
7 95676
 
6.9%
5 95109
 
6.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1386022
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 290344
20.9%
2 266757
19.2%
3 147664
10.7%
0 99736
 
7.2%
4 99682
 
7.2%
8 97615
 
7.0%
6 97257
 
7.0%
9 96182
 
6.9%
7 95676
 
6.9%
5 95109
 
6.9%

Most occurring scripts

ValueCountFrequency (%)
Common 1386022
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 290344
20.9%
2 266757
19.2%
3 147664
10.7%
0 99736
 
7.2%
4 99682
 
7.2%
8 97615
 
7.0%
6 97257
 
7.0%
9 96182
 
6.9%
7 95676
 
6.9%
5 95109
 
6.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1386022
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 290344
20.9%
2 266757
19.2%
3 147664
10.7%
0 99736
 
7.2%
4 99682
 
7.2%
8 97615
 
7.0%
6 97257
 
7.0%
9 96182
 
6.9%
7 95676
 
6.9%
5 95109
 
6.9%

endDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): 0.1%
Missing: 86170
Missing (%): 14.8%
Memory size: 4.5 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.783690172
Min length: 1

Characters and Unicode

Total characters: 1386364
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 34
2nd row: 246
3rd row: 289
4th row: 176
5th row: 246
ValueCountFrequency (%)
230 3038
 
0.6%
227 2924
 
0.6%
233 2713
 
0.5%
196 2664
 
0.5%
210 2658
 
0.5%
232 2593
 
0.5%
145 2544
 
0.5%
226 2520
 
0.5%
228 2516
 
0.5%
209 2373
 
0.5%
Other values (356) 471488
94.7%

Most occurring characters

ValueCountFrequency (%)
1 291087
21.0%
2 266673
19.2%
3 148129
10.7%
0 99590
 
7.2%
4 99486
 
7.2%
8 98001
 
7.1%
9 96639
 
7.0%
6 96151
 
6.9%
5 95612
 
6.9%
7 94996
 
6.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1386364
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 291087
21.0%
2 266673
19.2%
3 148129
10.7%
0 99590
 
7.2%
4 99486
 
7.2%
8 98001
 
7.1%
9 96639
 
7.0%
6 96151
 
6.9%
5 95612
 
6.9%
7 94996
 
6.9%

Most occurring scripts

ValueCountFrequency (%)
Common 1386364
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 291087
21.0%
2 266673
19.2%
3 148129
10.7%
0 99590
 
7.2%
4 99486
 
7.2%
8 98001
 
7.1%
9 96639
 
7.0%
6 96151
 
6.9%
5 95612
 
6.9%
7 94996
 
6.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1386364
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 291087
21.0%
2 266673
19.2%
3 148129
10.7%
0 99590
 
7.2%
4 99486
 
7.2%
8 98001
 
7.1%
9 96639
 
7.0%
6 96151
 
6.9%
5 95612
 
6.9%
7 94996
 
6.9%

year
Text

Missing 

Distinct: 184
Distinct (%): < 0.1%
Missing: 39600
Missing (%): 6.8%
Memory size: 4.5 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 2178404
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): < 0.1%

Sample

1st row: 1972
2nd row: 1971
3rd row: 1992
4th row: 1992
5th row: 1998
ValueCountFrequency (%)
1971 16999
 
3.1%
1966 15984
 
2.9%
1969 15769
 
2.9%
1970 15631
 
2.9%
1976 15292
 
2.8%
1980 15179
 
2.8%
1979 14958
 
2.7%
1972 14412
 
2.6%
1961 12797
 
2.3%
1984 12646
 
2.3%
Other values (174) 394934
72.5%

Most occurring characters

ValueCountFrequency (%)
9 627645
28.8%
1 599705
27.5%
7 176112
 
8.1%
6 174293
 
8.0%
8 162231
 
7.4%
0 112827
 
5.2%
2 89669
 
4.1%
5 85387
 
3.9%
3 81823
 
3.8%
4 68712
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2178404
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 627645
28.8%
1 599705
27.5%
7 176112
 
8.1%
6 174293
 
8.0%
8 162231
 
7.4%
0 112827
 
5.2%
2 89669
 
4.1%
5 85387
 
3.9%
3 81823
 
3.8%
4 68712
 
3.2%

Most occurring scripts

ValueCountFrequency (%)
Common 2178404
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9 627645
28.8%
1 599705
27.5%
7 176112
 
8.1%
6 174293
 
8.0%
8 162231
 
7.4%
0 112827
 
5.2%
2 89669
 
4.1%
5 85387
 
3.9%
3 81823
 
3.8%
4 68712
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2178404
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 627645
28.8%
1 599705
27.5%
7 176112
 
8.1%
6 174293
 
8.0%
8 162231
 
7.4%
0 112827
 
5.2%
2 89669
 
4.1%
5 85387
 
3.9%
3 81823
 
3.8%
4 68712
 
3.2%

month
Text

Missing 

Distinct12
Distinct (%)< 0.1%
Missing59025
Missing (%)10.1%
Memory size4.5 MiB
2025-01-08T17:56:06.187202image/svg+xmlMatplotlib v3.9.4, https://matplotlib.org/

Length

Max length: 2
Median length: 1
Mean length: 1.163293829
Min length: 1

Characters and Unicode

Total characters: 610934
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
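
The summary figures reported for `month` are consistent with one another, which can be checked with a little arithmetic. The sketch below assumes that the mean length is the total character count divided by the number of non-missing values, and that the missing percentage is taken over the 584201 observations stated in the report overview; the variable names are illustrative.

```python
# Figures copied from the report: overview and the `month` variable.
observations = 584201
missing = 59025
total_characters = 610934

present = observations - missing              # values that are not missing
mean_length = total_characters / present      # average characters per value
missing_pct = 100 * missing / observations    # share of missing values

print(present, round(mean_length, 9), round(missing_pct, 1))
```

Under these assumptions the computed mean length and missing percentage agree with the reported 1.163293829 and 10.1%.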

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 9
3rd row: 10
4th row: 6
5th row: 9
ValueCountFrequency (%)
8 67450
12.8%
5 63954
12.2%
7 63917
12.2%
6 59064
11.2%
4 55219
10.5%
3 46402
8.8%
10 42862
8.2%
9 36546
7.0%
11 25432
 
4.8%
2 25273
 
4.8%
Other values (2) 39057
7.4%

Most occurring characters

ValueCountFrequency (%)
1 132783
21.7%
8 67450
11.0%
5 63954
10.5%
7 63917
10.5%
6 59064
9.7%
4 55219
9.0%
3 46402
 
7.6%
0 42862
 
7.0%
2 42737
 
7.0%
9 36546
 
6.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 610934
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 132783
21.7%
8 67450
11.0%
5 63954
10.5%
7 63917
10.5%
6 59064
9.7%
4 55219
9.0%
3 46402
 
7.6%
0 42862
 
7.0%
2 42737
 
7.0%
9 36546
 
6.0%

Most occurring scripts

ValueCountFrequency (%)
Common 610934
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 132783
21.7%
8 67450
11.0%
5 63954
10.5%
7 63917
10.5%
6 59064
9.7%
4 55219
9.0%
3 46402
 
7.6%
0 42862
 
7.0%
2 42737
 
7.0%
9 36546
 
6.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 610934
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 132783
21.7%
8 67450
11.0%
5 63954
10.5%
7 63917
10.5%
6 59064
9.7%
4 55219
9.0%
3 46402
 
7.6%
0 42862
 
7.0%
2 42737
 
7.0%
9 36546
 
6.0%

day
Text

Missing 

Distinct: 31
Distinct (%): < 0.1%
Missing: 100844
Missing (%): 17.3%
Memory size: 4.5 MiB

Length

Max length: 2
Median length: 2
Mean length: 1.71638768
Min length: 1

Characters and Unicode

Total characters: 829628
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 15
3rd row: 24
4th row: 3
5th row: 29
ValueCountFrequency (%)
15 18907
 
3.9%
13 17259
 
3.6%
21 17015
 
3.5%
25 16944
 
3.5%
19 16843
 
3.5%
24 16667
 
3.4%
16 16371
 
3.4%
3 16363
 
3.4%
22 16276
 
3.4%
28 16272
 
3.4%
Other values (21) 314440
65.1%

Most occurring characters

ValueCountFrequency (%)
1 220646
26.6%
2 206016
24.8%
3 73256
 
8.8%
5 51604
 
6.2%
8 47441
 
5.7%
9 46929
 
5.7%
0 46463
 
5.6%
4 46187
 
5.6%
6 45993
 
5.5%
7 45093
 
5.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 829628
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 220646
26.6%
2 206016
24.8%
3 73256
 
8.8%
5 51604
 
6.2%
8 47441
 
5.7%
9 46929
 
5.7%
0 46463
 
5.6%
4 46187
 
5.6%
6 45993
 
5.5%
7 45093
 
5.4%

Most occurring scripts

ValueCountFrequency (%)
Common 829628
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 220646
26.6%
2 206016
24.8%
3 73256
 
8.8%
5 51604
 
6.2%
8 47441
 
5.7%
9 46929
 
5.7%
0 46463
 
5.6%
4 46187
 
5.6%
6 45993
 
5.5%
7 45093
 
5.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 829628
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 220646
26.6%
2 206016
24.8%
3 73256
 
8.8%
5 51604
 
6.2%
8 47441
 
5.7%
9 46929
 
5.7%
0 46463
 
5.6%
4 46187
 
5.6%
6 45993
 
5.5%
7 45093
 
5.4%
Distinct: 42558
Distinct (%): 7.3%
Missing: 51
Missing (%): < 0.1%
Memory size: 4.5 MiB

Length

Max length: 194
Median length: 11
Mean length: 12.14387743
Min length: 4

Characters and Unicode

Total characters: 7093846
Distinct characters: 74
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 14192
Unique (%): 2.4%

Sample

1st row: 01-03 February 1972
2nd row: 3 Sep 1971
3rd row: -- --- ----
4th row: 15 Oct 1992; 09:05-13:00 hrs
5th row: 24 Jun 1992; 10:30-11:40 hrs
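
The sample rows show that this free-text date variable mixes several layouts: single dates, day ranges, placeholder dashes, and dates followed by a time range. A minimal sketch of how the simplest layout ("day month year") might be extracted is given below. The names `MONTHS`, `PATTERN` and `parse_simple_date` are hypothetical, only English month names are assumed, and every other layout is deliberately left unparsed (returned as `None`) rather than guessed.

```python
import re
from datetime import date

# Three-letter English month abbreviations mapped to month numbers.
MONTHS = {name: number for number, name in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}

# A single "day month year" date at the start of the string;
# anything after the year (such as a time range) is ignored.
PATTERN = re.compile(r"^(\d{1,2}) ([A-Za-z]{3})[a-z]* (\d{4})\b")

def parse_simple_date(text):
    """Return a date for 'D Mon YYYY' style strings, otherwise None."""
    match = PATTERN.match(text)
    if match is None:
        return None
    day, month_name, year = match.groups()
    month = MONTHS.get(month_name.lower())
    if month is None:
        return None
    try:
        return date(int(year), month, int(day))
    except ValueError:  # e.g. day 31 in a 30-day month
        return None

for raw in ["3 Sep 1971", "15 Oct 1992; 09:05-13:00 hrs", "-- --- ----"]:
    print(repr(raw), "->", parse_simple_date(raw))
```

Returning `None` for ranges such as "01-03 February 1972" keeps the ambiguity visible instead of silently picking one end of the range.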
ValueCountFrequency (%)
173374
 
9.4%
may 65316
 
3.5%
aug 63760
 
3.5%
jul 58386
 
3.2%
jun 53770
 
2.9%
apr 50984
 
2.8%
mar 43098
 
2.3%
oct 40349
 
2.2%
sep 34295
 
1.9%
hrs 24306
 
1.3%
Other values (3264) 1238022
67.1%

Most occurring characters

ValueCountFrequency (%)
1261510
17.8%
1 874532
 
12.3%
9 688315
 
9.7%
- 499756
 
7.0%
2 328876
 
4.6%
0 243409
 
3.4%
6 227222
 
3.2%
7 227024
 
3.2%
8 217953
 
3.1%
u 208644
 
2.9%
Other values (64) 2316605
32.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3263423
46.0%
Lowercase Letter 1431190
20.2%
Space Separator 1261510
 
17.8%
Uppercase Letter 543907
 
7.7%
Dash Punctuation 499756
 
7.0%
Other Punctuation 92897
 
1.3%
Open Punctuation 581
 
< 0.1%
Close Punctuation 581
 
< 0.1%
Format 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 208644
14.6%
r 158708
11.1%
a 157043
11.0%
e 120723
8.4%
n 97727
 
6.8%
p 94749
 
6.6%
y 81426
 
5.7%
l 78861
 
5.5%
g 78230
 
5.5%
c 71344
 
5.0%
Other values (16) 283735
19.8%
Uppercase Letter
ValueCountFrequency (%)
J 147803
27.2%
A 124847
23.0%
M 113882
20.9%
O 43248
 
8.0%
S 39048
 
7.2%
F 26562
 
4.9%
N 26070
 
4.8%
D 18361
 
3.4%
C 3404
 
0.6%
E 144
 
< 0.1%
Other values (13) 538
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 874532
26.8%
9 688315
21.1%
2 328876
 
10.1%
0 243409
 
7.5%
6 227222
 
7.0%
7 227024
 
7.0%
8 217953
 
6.7%
3 176664
 
5.4%
5 152254
 
4.7%
4 127174
 
3.9%
Other Punctuation
ValueCountFrequency (%)
: 41924
45.1%
; 34825
37.5%
. 14987
 
16.1%
, 770
 
0.8%
/ 307
 
0.3%
' 46
 
< 0.1%
" 20
 
< 0.1%
? 18
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 580
99.8%
[ 1
 
0.2%
Close Punctuation
ValueCountFrequency (%)
) 580
99.8%
] 1
 
0.2%
Space Separator
ValueCountFrequency (%)
1261510
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 499756
100.0%
Format
ValueCountFrequency (%)
­ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5118749
72.2%
Latin 1975097
 
27.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 208644
 
10.6%
r 158708
 
8.0%
a 157043
 
8.0%
J 147803
 
7.5%
A 124847
 
6.3%
e 120723
 
6.1%
M 113882
 
5.8%
n 97727
 
4.9%
p 94749
 
4.8%
y 81426
 
4.1%
Other values (39) 669545
33.9%
Common
ValueCountFrequency (%)
1261510
24.6%
1 874532
17.1%
9 688315
13.4%
- 499756
 
9.8%
2 328876
 
6.4%
0 243409
 
4.8%
6 227222
 
4.4%
7 227024
 
4.4%
8 217953
 
4.3%
3 176664
 
3.5%
Other values (15) 373488
 
7.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7093845
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1261510
17.8%
1 874532
 
12.3%
9 688315
 
9.7%
- 499756
 
7.0%
2 328876
 
4.6%
0 243409
 
3.4%
6 227222
 
3.2%
7 227024
 
3.2%
8 217953
 
3.1%
u 208644
 
2.9%
Other values (63) 2316604
32.7%
None
ValueCountFrequency (%)
­ 1
100.0%
Distinct: 6286
Distinct (%): 1.1%
Missing: 4414
Missing (%): 0.8%
Memory size: 4.5 MiB

Length

Max length: 167
Median length: 118
Mean length: 48.81643259
Min length: 4

Characters and Unicode

Total characters: 28303133
Distinct characters: 83
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1092
Unique (%): 0.2%

Sample

1st row: Oceania, Papua New Guinea, Central Province, Kairuku-Hiri District, New Guinea
2nd row: North America, United States, North Carolina, Buncombe - Yancey
3rd row: Oceania, Pacific Ocean , Tonga, Tonga Islands, Tongatapu Island Group, Tonga Islands
4th row: North America, Grenada, St. George Parish, Lesser Antilles, Windward Islands, Grenada Island
5th row: North America, United States, Virginia, Augusta
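
The sample rows suggest that each value is a comma-separated path from broad to narrow geography, with occasional stray spaces around the commas. A small sketch of splitting such a value into its levels follows; the helper name `split_levels` is hypothetical, and the sketch assumes that no level itself contains a comma.

```python
def split_levels(value):
    """Split a comma-separated geography string into trimmed, non-empty levels."""
    return [part.strip() for part in value.split(",") if part.strip()]

rows = [
    "Oceania, Pacific Ocean , Tonga, Tonga Islands, Tongatapu Island Group, Tonga Islands",
    "North America, United States, Virginia, Augusta",
]
for row in rows:
    print(len(split_levels(row)), split_levels(row))
```

The number of levels differs from row to row (six and four here), so a position in the list cannot be mapped to a fixed rank such as country or county without further information.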
ValueCountFrequency (%)
america 483266
 
12.9%
north 476209
 
12.7%
states 351020
 
9.4%
united 349359
 
9.4%
virginia 96173
 
2.6%
south 71896
 
1.9%
islands 71471
 
1.9%
carolina 61728
 
1.7%
54664
 
1.5%
asia 39306
 
1.1%
Other values (4622) 1680221
45.0%

Most occurring characters

ValueCountFrequency (%)
3155526
 
11.1%
a 2668328
 
9.4%
i 2176293
 
7.7%
e 2119779
 
7.5%
t 1973350
 
7.0%
r 1844062
 
6.5%
, 1669519
 
5.9%
n 1511861
 
5.3%
o 1298349
 
4.6%
s 1011828
 
3.6%
Other values (73) 8874238
31.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 19740985
69.7%
Uppercase Letter 3656252
 
12.9%
Space Separator 3155526
 
11.1%
Other Punctuation 1685470
 
6.0%
Dash Punctuation 42316
 
0.1%
Open Punctuation 11057
 
< 0.1%
Close Punctuation 11052
 
< 0.1%
Math Symbol 409
 
< 0.1%
Decimal Number 64
 
< 0.1%
Modifier Letter 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2668328
13.5%
i 2176293
11.0%
e 2119779
10.7%
t 1973350
10.0%
r 1844062
9.3%
n 1511861
7.7%
o 1298349
 
6.6%
s 1011828
 
5.1%
c 896985
 
4.5%
h 740667
 
3.8%
Other values (28) 3499483
17.7%
Uppercase Letter
ValueCountFrequency (%)
A 644681
17.6%
N 528051
14.4%
S 527098
14.4%
U 359922
9.8%
P 226035
 
6.2%
C 185016
 
5.1%
M 170358
 
4.7%
I 135691
 
3.7%
V 116151
 
3.2%
G 115778
 
3.2%
Other values (18) 647471
17.7%
Other Punctuation
ValueCountFrequency (%)
, 1669519
99.1%
. 13671
 
0.8%
' 2228
 
0.1%
? 41
 
< 0.1%
/ 11
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 42077
99.4%
239
 
0.6%
Open Punctuation
ValueCountFrequency (%)
( 10381
93.9%
[ 676
 
6.1%
Close Punctuation
ValueCountFrequency (%)
) 10376
93.9%
] 676
 
6.1%
Math Symbol
ValueCountFrequency (%)
= 389
95.1%
+ 20
 
4.9%
Decimal Number
ValueCountFrequency (%)
1 32
50.0%
0 32
50.0%
Space Separator
ValueCountFrequency (%)
3155526
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 23397237
82.7%
Common 4905896
 
17.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2668328
 
11.4%
i 2176293
 
9.3%
e 2119779
 
9.1%
t 1973350
 
8.4%
r 1844062
 
7.9%
n 1511861
 
6.5%
o 1298349
 
5.5%
s 1011828
 
4.3%
c 896985
 
3.8%
h 740667
 
3.2%
Other values (56) 7155735
30.6%
Common
ValueCountFrequency (%)
3155526
64.3%
, 1669519
34.0%
- 42077
 
0.9%
. 13671
 
0.3%
( 10381
 
0.2%
) 10376
 
0.2%
' 2228
 
< 0.1%
[ 676
 
< 0.1%
] 676
 
< 0.1%
= 389
 
< 0.1%
Other values (7) 377
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 28276056
99.9%
None 26786
 
0.1%
Punctuation 239
 
< 0.1%
Latin Ext Additional 50
 
< 0.1%
Modifier Letters 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3155526
 
11.2%
a 2668328
 
9.4%
i 2176293
 
7.7%
e 2119779
 
7.5%
t 1973350
 
7.0%
r 1844062
 
6.5%
, 1669519
 
5.9%
n 1511861
 
5.3%
o 1298349
 
4.6%
s 1011828
 
3.6%
Other values (57) 8847161
31.3%
None
ValueCountFrequency (%)
é 6953
26.0%
á 5925
22.1%
ã 4537
16.9%
í 4305
16.1%
ó 3223
12.0%
ô 1182
 
4.4%
ñ 439
 
1.6%
â 51
 
0.2%
Đ 50
 
0.2%
ı 48
 
0.2%
Other values (3) 73
 
0.3%
Punctuation
ValueCountFrequency (%)
239
100.0%
Latin Ext Additional
ValueCountFrequency (%)
50
100.0%
Modifier Letters
ValueCountFrequency (%)
ʻ 2
100.0%

continent
Text

Missing 

Distinct: 6
Distinct (%): < 0.1%
Missing: 10069
Missing (%): 1.7%
Memory size: 4.5 MiB

Length

Max length: 13
Median length: 13
Mean length: 11.7863662
Min length: 4

Characters and Unicode

Total characters: 6766930
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: OCEANIA
2nd row: NORTH_AMERICA
3rd row: OCEANIA
4th row: NORTH_AMERICA
5th row: NORTH_AMERICA
ValueCountFrequency (%)
north_america 416962
72.6%
south_america 64731
 
11.3%
asia 39723
 
6.9%
oceania 29733
 
5.2%
africa 20601
 
3.6%
europe 2382
 
0.4%

Most occurring characters

ValueCountFrequency (%)
A 1143500
16.9%
R 921638
13.6%
I 571750
8.4%
C 532027
7.9%
E 516190
7.6%
O 513808
7.6%
T 481693
7.1%
H 481693
7.1%
_ 481693
7.1%
M 481693
7.1%
Other values (5) 641245
9.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 6285237
92.9%
Connector Punctuation 481693
 
7.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 1143500
18.2%
R 921638
14.7%
I 571750
9.1%
C 532027
8.5%
E 516190
8.2%
O 513808
8.2%
T 481693
7.7%
H 481693
7.7%
M 481693
7.7%
N 446695
 
7.1%
Other values (4) 194550
 
3.1%
Connector Punctuation
ValueCountFrequency (%)
_ 481693
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6285237
92.9%
Common 481693
 
7.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 1143500
18.2%
R 921638
14.7%
I 571750
9.1%
C 532027
8.5%
E 516190
8.2%
O 513808
8.2%
T 481693
7.7%
H 481693
7.7%
M 481693
7.7%
N 446695
 
7.1%
Other values (4) 194550
 
3.1%
Common
ValueCountFrequency (%)
_ 481693
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6766930
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 1143500
16.9%
R 921638
13.6%
I 571750
8.4%
C 532027
7.9%
E 516190
7.6%
O 513808
7.6%
T 481693
7.1%
H 481693
7.1%
_ 481693
7.1%
M 481693
7.1%
Other values (5) 641245
9.5%

waterBody
Text

Missing 

Distinct: 3
Distinct (%): < 0.1%
Missing: 555994
Missing (%): 95.2%
Memory size: 4.5 MiB

Length

Max length: 14
Median length: 13
Mean length: 12.96972383
Min length: 12

Characters and Unicode

Total characters: 365837
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Pacific Ocean
2nd row: Pacific Ocean
3rd row: Pacific Ocean
4th row: Pacific Ocean
5th row: Indian Ocean
ValueCountFrequency (%)
ocean 28207
50.0%
pacific 26665
47.3%
indian 1198
 
2.1%
atlantic 344
 
0.6%
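
The word table above counts lower-cased tokens rather than whole values: "ocean" appears 28207 times, which equals both the number of non-missing values (584201 − 555994) and the sum of the "pacific", "indian" and "atlantic" counts. The sketch below reproduces that behaviour on the five Sample rows, assuming the table is built by lower-casing each value and splitting it on whitespace; that assumption, and the helper name `word_counts`, are not confirmed by the report itself.

```python
from collections import Counter

def word_counts(values):
    """Count lower-cased, whitespace-separated tokens across all values."""
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    return counts

# Illustrative input: the five Sample rows of waterBody.
sample = ["Pacific Ocean"] * 4 + ["Indian Ocean"]
counts = word_counts(sample)
print(counts.most_common())
```

Because every value ends in "Ocean", the token count for "ocean" always equals the number of values, just as in the full table.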

Most occurring characters

ValueCountFrequency (%)
c 81881
22.4%
a 56414
15.4%
i 54872
15.0%
n 30947
 
8.5%
28207
 
7.7%
O 28207
 
7.7%
e 28207
 
7.7%
P 26665
 
7.3%
f 26665
 
7.3%
I 1198
 
0.3%
Other values (4) 2574
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 281216
76.9%
Uppercase Letter 56414
 
15.4%
Space Separator 28207
 
7.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 81881
29.1%
a 56414
20.1%
i 54872
19.5%
n 30947
 
11.0%
e 28207
 
10.0%
f 26665
 
9.5%
d 1198
 
0.4%
t 688
 
0.2%
l 344
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
O 28207
50.0%
P 26665
47.3%
I 1198
 
2.1%
A 344
 
0.6%
Space Separator
ValueCountFrequency (%)
28207
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 337630
92.3%
Common 28207
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 81881
24.3%
a 56414
16.7%
i 54872
16.3%
n 30947
 
9.2%
O 28207
 
8.4%
e 28207
 
8.4%
P 26665
 
7.9%
f 26665
 
7.9%
I 1198
 
0.4%
d 1198
 
0.4%
Other values (3) 1376
 
0.4%
Common
ValueCountFrequency (%)
28207
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 365837
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 81881
22.4%
a 56414
15.4%
i 54872
15.0%
n 30947
 
8.5%
28207
 
7.7%
O 28207
 
7.7%
e 28207
 
7.7%
P 26665
 
7.3%
f 26665
 
7.3%
I 1198
 
0.3%
Other values (4) 2574
 
0.7%

islandGroup
Text

Missing 

Distinct: 41
Distinct (%): 0.2%
Missing: 564324
Missing (%): 96.6%
Memory size: 4.5 MiB

Length

Max length: 31
Median length: 25
Mean length: 13.3327967
Min length: 10

Characters and Unicode

Total characters: 265016
Distinct characters: 45
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): < 0.1%

Sample

1st row: Windward Islands
2nd row: Virgin Islands
3rd row: Hispaniola
4th row: Hispaniola
5th row: Greater Sunda Islands
ValueCountFrequency (%)
islands 10225
31.0%
hispaniola 8927
27.1%
virgin 2527
 
7.7%
windward 2377
 
7.2%
bahama 1504
 
4.6%
leeward 1357
 
4.1%
sunda 1019
 
3.1%
greater 1018
 
3.1%
northern 671
 
2.0%
solomon 655
 
2.0%
Other values (48) 2663
 
8.1%

Most occurring characters

ValueCountFrequency (%)
a 41073
15.5%
s 30081
11.4%
n 27407
10.3%
i 26949
10.2%
l 20663
 
7.8%
d 17902
 
6.8%
13066
 
4.9%
o 12195
 
4.6%
r 10747
 
4.1%
I 10283
 
3.9%
Other values (35) 54650
20.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 218943
82.6%
Uppercase Letter 32978
 
12.4%
Space Separator 13066
 
4.9%
Open Punctuation 8
 
< 0.1%
Math Symbol 8
 
< 0.1%
Close Punctuation 8
 
< 0.1%
Other Punctuation 5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 41073
18.8%
s 30081
13.7%
n 27407
12.5%
i 26949
12.3%
l 20663
9.4%
d 17902
8.2%
o 12195
 
5.6%
r 10747
 
4.9%
p 9142
 
4.2%
e 5828
 
2.7%
Other values (13) 16956
7.7%
Uppercase Letter
ValueCountFrequency (%)
I 10283
31.2%
H 8927
27.1%
V 2533
 
7.7%
W 2377
 
7.2%
S 1736
 
5.3%
B 1647
 
5.0%
L 1372
 
4.2%
G 1061
 
3.2%
C 934
 
2.8%
N 775
 
2.4%
Other values (7) 1333
 
4.0%
Space Separator
ValueCountFrequency (%)
13066
100.0%
Open Punctuation
ValueCountFrequency (%)
( 8
100.0%
Math Symbol
ValueCountFrequency (%)
= 8
100.0%
Close Punctuation
ValueCountFrequency (%)
) 8
100.0%
Other Punctuation
ValueCountFrequency (%)
. 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 251921
95.1%
Common 13095
 
4.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 41073
16.3%
s 30081
11.9%
n 27407
10.9%
i 26949
10.7%
l 20663
8.2%
d 17902
 
7.1%
o 12195
 
4.8%
r 10747
 
4.3%
I 10283
 
4.1%
p 9142
 
3.6%
Other values (30) 45479
18.1%
Common
ValueCountFrequency (%)
13066
99.8%
( 8
 
0.1%
= 8
 
0.1%
) 8
 
0.1%
. 5
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 265016
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 41073
15.5%
s 30081
11.4%
n 27407
10.3%
i 26949
10.2%
l 20663
 
7.8%
d 17902
 
6.8%
13066
 
4.9%
o 12195
 
4.6%
r 10747
 
4.1%
I 10283
 
3.9%
Other values (35) 54650
20.6%

island
Text

Missing 

Distinct: 39
Distinct (%): 0.5%
Missing: 576136
Missing (%): 98.6%
Memory size: 4.5 MiB

Length

Max length: 20
Median length: 10
Mean length: 10.77445753
Min length: 6

Characters and Unicode

Total characters: 86896
Distinct characters: 44
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): 0.1%

Sample

1st row: New Guinea
2nd row: Grenada Island
3rd row: New Guinea
4th row: New Guinea
5th row: Little Swan Island
ValueCountFrequency (%)
new 4350
29.0%
guinea 4350
29.0%
island 1306
 
8.7%
borneo 712
 
4.7%
bougainville 652
 
4.3%
sumatra 558
 
3.7%
okinawa 493
 
3.3%
grenada 267
 
1.8%
isla 258
 
1.7%
swan 241
 
1.6%
Other values (44) 1803
12.0%

Most occurring characters

ValueCountFrequency (%)
e 11374
13.1%
a 10388
12.0%
n 8928
10.3%
6925
 
8.0%
i 6731
 
7.7%
u 5716
 
6.6%
w 5086
 
5.9%
G 4959
 
5.7%
N 4459
 
5.1%
l 3060
 
3.5%
Other values (34) 19270
22.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 65206
75.0%
Uppercase Letter 14765
 
17.0%
Space Separator 6925
 
8.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 11374
17.4%
a 10388
15.9%
n 8928
13.7%
i 6731
10.3%
u 5716
8.8%
w 5086
7.8%
l 3060
 
4.7%
o 2768
 
4.2%
d 2350
 
3.6%
s 2071
 
3.2%
Other values (14) 6734
10.3%
Uppercase Letter
ValueCountFrequency (%)
G 4959
33.6%
N 4459
30.2%
I 1683
 
11.4%
B 1407
 
9.5%
S 841
 
5.7%
O 512
 
3.5%
U 199
 
1.3%
K 190
 
1.3%
L 178
 
1.2%
R 151
 
1.0%
Other values (9) 186
 
1.3%
Space Separator
ValueCountFrequency (%)
6925
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 79971
92.0%
Common 6925
 
8.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 11374
14.2%
a 10388
13.0%
n 8928
11.2%
i 6731
8.4%
u 5716
 
7.1%
w 5086
 
6.4%
G 4959
 
6.2%
N 4459
 
5.6%
l 3060
 
3.8%
o 2768
 
3.5%
Other values (33) 16502
20.6%
Common
ValueCountFrequency (%)
6925
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 86748
99.8%
None 148
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 11374
13.1%
a 10388
12.0%
n 8928
10.3%
6925
 
8.0%
i 6731
 
7.8%
u 5716
 
6.6%
w 5086
 
5.9%
G 4959
 
5.7%
N 4459
 
5.1%
l 3060
 
3.5%
Other values (33) 19122
22.0%
None
ValueCountFrequency (%)
á 148
100.0%

countryCode
Text

Missing 

Distinct: 198
Distinct (%): < 0.1%
Missing: 10837
Missing (%): 1.9%
Memory size: 4.5 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1146728
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): < 0.1%

Sample

1st row: PG
2nd row: US
3rd row: TO
4th row: GD
5th row: US
ValueCountFrequency (%)
us 334216
58.3%
mx 22787
 
4.0%
ec 16235
 
2.8%
br 14722
 
2.6%
pe 12875
 
2.2%
ph 11392
 
2.0%
hn 10938
 
1.9%
pa 7718
 
1.3%
jm 7293
 
1.3%
gu 5665
 
1.0%
Other values (188) 129523
 
22.6%

Most occurring characters

ValueCountFrequency (%)
U 348407
30.4%
S 342999
29.9%
P 50905
 
4.4%
M 48558
 
4.2%
C 41150
 
3.6%
E 38991
 
3.4%
H 32407
 
2.8%
G 25767
 
2.2%
R 24403
 
2.1%
X 22787
 
2.0%
Other values (16) 170354
14.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1146728
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 348407
30.4%
S 342999
29.9%
P 50905
 
4.4%
M 48558
 
4.2%
C 41150
 
3.6%
E 38991
 
3.4%
H 32407
 
2.8%
G 25767
 
2.2%
R 24403
 
2.1%
X 22787
 
2.0%
Other values (16) 170354
14.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 1146728
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 348407
30.4%
S 342999
29.9%
P 50905
 
4.4%
M 48558
 
4.2%
C 41150
 
3.6%
E 38991
 
3.4%
H 32407
 
2.8%
G 25767
 
2.2%
R 24403
 
2.1%
X 22787
 
2.0%
Other values (16) 170354
14.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1146728
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 348407
30.4%
S 342999
29.9%
P 50905
 
4.4%
M 48558
 
4.2%
C 41150
 
3.6%
E 38991
 
3.4%
H 32407
 
2.8%
G 25767
 
2.2%
R 24403
 
2.1%
X 22787
 
2.0%
Other values (16) 170354
14.9%

stateProvince
Text

Missing 

Distinct: 2059
Distinct (%): 0.4%
Missing: 17001
Missing (%): 2.9%
Memory size: 4.5 MiB

Length

Max length: 69
Median length: 52
Mean length: 10.58665021
Min length: 3

Characters and Unicode

Total characters: 6004748
Distinct characters: 72
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 356
Unique (%): 0.1%

Sample

1st row: Central Province
2nd row: North Carolina
3rd row: Tonga Islands
4th row: St. George Parish
5th row: Virginia
ValueCountFrequency (%)
virginia 93314
 
11.0%
carolina 61709
 
7.2%
north 57614
 
6.8%
maryland 32649
 
3.8%
province 27443
 
3.2%
pennsylvania 18911
 
2.2%
west 18140
 
2.1%
florida 18100
 
2.1%
island 18015
 
2.1%
tennessee 17444
 
2.0%
Other values (1937) 487863
57.3%

Most occurring characters

ValueCountFrequency (%)
a 826291
13.8%
i 632216
 
10.5%
n 557794
 
9.3%
r 474453
 
7.9%
o 407390
 
6.8%
e 304504
 
5.1%
284002
 
4.7%
l 264922
 
4.4%
s 256173
 
4.3%
t 191100
 
3.2%
Other values (62) 1805903
30.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4862616
81.0%
Uppercase Letter 830502
 
13.8%
Space Separator 284002
 
4.7%
Dash Punctuation 16262
 
0.3%
Other Punctuation 9979
 
0.2%
Open Punctuation 537
 
< 0.1%
Close Punctuation 532
 
< 0.1%
Math Symbol 318
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 826291
17.0%
i 632216
13.0%
n 557794
11.5%
r 474453
9.8%
o 407390
8.4%
e 304504
 
6.3%
l 264922
 
5.4%
s 256173
 
5.3%
t 191100
 
3.9%
g 142256
 
2.9%
Other values (24) 805517
16.6%
Uppercase Letter
ValueCountFrequency (%)
C 108499
13.1%
V 99187
11.9%
P 90787
10.9%
N 84504
10.2%
M 71209
8.6%
I 44333
 
5.3%
S 44167
 
5.3%
T 42704
 
5.1%
G 35681
 
4.3%
A 34622
 
4.2%
Other values (17) 174809
21.0%
Other Punctuation
ValueCountFrequency (%)
. 9193
92.1%
' 757
 
7.6%
? 19
 
0.2%
/ 6
 
0.1%
, 4
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
= 298
93.7%
+ 20
 
6.3%
Space Separator
ValueCountFrequency (%)
284002
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 16262
100.0%
Open Punctuation
ValueCountFrequency (%)
( 537
100.0%
Close Punctuation
ValueCountFrequency (%)
) 532
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5693118
94.8%
Common 311630
 
5.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 826291
14.5%
i 632216
 
11.1%
n 557794
 
9.8%
r 474453
 
8.3%
o 407390
 
7.2%
e 304504
 
5.3%
l 264922
 
4.7%
s 256173
 
4.5%
t 191100
 
3.4%
g 142256
 
2.5%
Other values (51) 1636019
28.7%
Common
ValueCountFrequency (%)
284002
91.1%
- 16262
 
5.2%
. 9193
 
2.9%
' 757
 
0.2%
( 537
 
0.2%
) 532
 
0.2%
= 298
 
0.1%
+ 20
 
< 0.1%
? 19
 
< 0.1%
/ 6
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5984875
99.7%
None 19873
 
0.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 826291
13.8%
i 632216
 
10.6%
n 557794
 
9.3%
r 474453
 
7.9%
o 407390
 
6.8%
e 304504
 
5.1%
284002
 
4.7%
l 264922
 
4.4%
s 256173
 
4.3%
t 191100
 
3.2%
Other values (53) 1786030
29.8%
None
ValueCountFrequency (%)
á 4907
24.7%
é 4585
23.1%
ã 3690
18.6%
ó 2908
14.6%
í 2325
11.7%
ô 1036
 
5.2%
ñ 367
 
1.8%
ı 48
 
0.2%
Î 7
 
< 0.1%

county
Text

Missing 

Distinct: 3056
Distinct (%): 0.8%
Missing: 191557
Missing (%): 32.8%
Memory size: 4.5 MiB

Length

Max length: 56
Median length: 43
Mean length: 9.394395432
Min length: 3

Characters and Unicode

Total characters: 3688653
Distinct characters: 74
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 504
Unique (%): 0.1%

Sample

1st row: Kairuku-Hiri District
2nd row: Buncombe - Yancey
3rd row: Tongatapu Island Group
4th row: Augusta
5th row: Elko
ValueCountFrequency (%)
21119
 
3.8%
island 14180
 
2.6%
swain 12742
 
2.3%
city 8568
 
1.6%
province 8458
 
1.5%
giles 8024
 
1.5%
frederick 7508
 
1.4%
macon 7377
 
1.3%
municipality 7367
 
1.3%
haywood 7297
 
1.3%
Other values (2826) 448585
81.4%

Most occurring characters

ValueCountFrequency (%)
a 361375
 
9.8%
e 318401
 
8.6%
n 281913
 
7.6%
o 250126
 
6.8%
i 237836
 
6.4%
r 221961
 
6.0%
l 181195
 
4.9%
158581
 
4.3%
s 154891
 
4.2%
t 142082
 
3.9%
Other values (64) 1380292
37.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2956891
80.2%
Uppercase Letter 526246
 
14.3%
Space Separator 158581
 
4.3%
Dash Punctuation 25865
 
0.7%
Close Punctuation 7839
 
0.2%
Open Punctuation 7839
 
0.2%
Other Punctuation 5243
 
0.1%
Math Symbol 83
 
< 0.1%
Decimal Number 64
 
< 0.1%
Modifier Letter 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 361375
12.2%
e 318401
10.8%
n 281913
9.5%
o 250126
 
8.5%
i 237836
 
8.0%
r 221961
 
7.5%
l 181195
 
6.1%
s 154891
 
5.2%
t 142082
 
4.8%
c 111847
 
3.8%
Other values (25) 695264
23.5%
Uppercase Letter
ValueCountFrequency (%)
M 56403
 
10.7%
S 49852
 
9.5%
C 48154
 
9.2%
P 46114
 
8.8%
G 36649
 
7.0%
B 29767
 
5.7%
I 27305
 
5.2%
A 26724
 
5.1%
H 24796
 
4.7%
R 21003
 
4.0%
Other values (15) 159479
30.3%
Other Punctuation
ValueCountFrequency (%)
. 3451
65.8%
' 1471
28.1%
, 294
 
5.6%
? 22
 
0.4%
/ 5
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 25626
99.1%
239
 
0.9%
Decimal Number
ValueCountFrequency (%)
1 32
50.0%
0 32
50.0%
Space Separator
ValueCountFrequency (%)
158581
100.0%
Close Punctuation
ValueCountFrequency (%)
) 7839
100.0%
Open Punctuation
ValueCountFrequency (%)
( 7839
100.0%
Math Symbol
ValueCountFrequency (%)
= 83
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3483137
94.4%
Common 205516
 
5.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 361375
 
10.4%
e 318401
 
9.1%
n 281913
 
8.1%
o 250126
 
7.2%
i 237836
 
6.8%
r 221961
 
6.4%
l 181195
 
5.2%
s 154891
 
4.4%
t 142082
 
4.1%
c 111847
 
3.2%
Other values (50) 1221510
35.1%
Common
ValueCountFrequency (%)
158581
77.2%
- 25626
 
12.5%
) 7839
 
3.8%
( 7839
 
3.8%
. 3451
 
1.7%
' 1471
 
0.7%
, 294
 
0.1%
239
 
0.1%
= 83
 
< 0.1%
1 32
 
< 0.1%
Other values (4) 61
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3684587
99.9%
None 3825
 
0.1%
Punctuation 239
 
< 0.1%
Modifier Letters 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 361375
 
9.8%
e 318401
 
8.6%
n 281913
 
7.7%
o 250126
 
6.8%
i 237836
 
6.5%
r 221961
 
6.0%
l 181195
 
4.9%
158581
 
4.3%
s 154891
 
4.2%
t 142082
 
3.9%
Other values (53) 1376226
37.4%
None
ValueCountFrequency (%)
é 1444
37.8%
í 911
23.8%
á 870
22.7%
ó 315
 
8.2%
ô 96
 
2.5%
ñ 72
 
1.9%
â 51
 
1.3%
ü 38
 
1.0%
è 28
 
0.7%
Punctuation
ValueCountFrequency (%)
239
100.0%
Modifier Letters
ValueCountFrequency (%)
ʻ 2
100.0%
Distinct: 56242
Distinct (%): 9.7%
Missing: 2303
Missing (%): 0.4%
Memory size: 4.5 MiB

Length

Max length: 295
Median length: 193
Mean length: 54.40064066
Min length: 2

Characters and Unicode

Total characters: 31655624
Distinct characters: 110
Distinct categories: 12
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 24789
Unique (%): 4.3%

Sample

1st row: Kairuku, Yule Island
2nd row: Pisgah National Forest, near Cane River Gap
3rd row: No Locality Data
4th row: Tongatapu Island, adjacent to Fua'amotu Airport
5th row: Grand Anse Bay, west end of, along road to jetty just east of base of Quarantine Point
ValueCountFrequency (%)
of 456712
 
8.0%
mi 190409
 
3.3%
road 182915
 
3.2%
route 156226
 
2.7%
on 147202
 
2.6%
national 106083
 
1.8%
by 93415
 
1.6%
forest 89661
 
1.6%
junction 81776
 
1.4%
km 68711
 
1.2%
Other values (30771) 4165761
72.6%

Most occurring characters

ValueCountFrequency (%)
5156973
16.3%
a 2389818
 
7.5%
o 2384544
 
7.5%
e 1748111
 
5.5%
n 1666496
 
5.3%
i 1568945
 
5.0%
t 1523016
 
4.8%
r 1291111
 
4.1%
l 964589
 
3.0%
, 845140
 
2.7%
Other values (100) 12116881
38.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 19646027
62.1%
Space Separator 5156973
 
16.3%
Uppercase Letter 3980245
 
12.6%
Other Punctuation 1240931
 
3.9%
Decimal Number 1169470
 
3.7%
Open Punctuation 200092
 
0.6%
Close Punctuation 200069
 
0.6%
Dash Punctuation 36149
 
0.1%
Math Symbol 25534
 
0.1%
Format 126
 
< 0.1%
Other values (2) 8
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2389818
12.2%
o 2384544
12.1%
e 1748111
 
8.9%
n 1666496
 
8.5%
i 1568945
 
8.0%
t 1523016
 
7.8%
r 1291111
 
6.6%
l 964589
 
4.9%
u 736911
 
3.8%
s 725961
 
3.7%
Other values (38) 4646525
23.7%
Uppercase Letter
ValueCountFrequency (%)
R 444979
 
11.2%
S 427726
 
10.7%
N 377809
 
9.5%
C 256106
 
6.4%
M 232345
 
5.8%
E 225035
 
5.7%
W 219583
 
5.5%
P 208932
 
5.2%
F 190052
 
4.8%
A 187582
 
4.7%
Other values (18) 1210096
30.4%
Other Punctuation
ValueCountFrequency (%)
, 845140
68.1%
. 367701
29.6%
' 12150
 
1.0%
; 6267
 
0.5%
/ 6172
 
0.5%
" 1760
 
0.1%
: 693
 
0.1%
? 661
 
0.1%
# 351
 
< 0.1%
& 36
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 222198
19.0%
0 175576
15.0%
2 157851
13.5%
5 124586
10.7%
3 116675
10.0%
6 105408
9.0%
4 93349
8.0%
7 69091
 
5.9%
8 55301
 
4.7%
9 49435
 
4.2%
Open Punctuation
ValueCountFrequency (%)
( 200029
> 99.9%
[ 62
 
< 0.1%
1
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
= 24365
95.4%
+ 1165
 
4.6%
< 4
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 200007
> 99.9%
] 62
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 36140
> 99.9%
9
 
< 0.1%
Space Separator
ValueCountFrequency (%)
5156973
100.0%
Format
ValueCountFrequency (%)
­ 126
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 7
100.0%
Control
ValueCountFrequency (%)
 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 23626272
74.6%
Common 8029352
 
25.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2389818
 
10.1%
o 2384544
 
10.1%
e 1748111
 
7.4%
n 1666496
 
7.1%
i 1568945
 
6.6%
t 1523016
 
6.4%
r 1291111
 
5.5%
l 964589
 
4.1%
u 736911
 
3.1%
s 725961
 
3.1%
Other values (66) 8626770
36.5%
Common
ValueCountFrequency (%)
5156973
64.2%
, 845140
 
10.5%
. 367701
 
4.6%
1 222198
 
2.8%
( 200029
 
2.5%
) 200007
 
2.5%
0 175576
 
2.2%
2 157851
 
2.0%
5 124586
 
1.6%
3 116675
 
1.5%
Other values (24) 462616
 
5.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 31626455
99.9%
None 29146
 
0.1%
Latin Ext Additional 13
 
< 0.1%
Punctuation 10
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5156973
16.3%
a 2389818
 
7.6%
o 2384544
 
7.5%
e 1748111
 
5.5%
n 1666496
 
5.3%
i 1568945
 
5.0%
t 1523016
 
4.8%
r 1291111
 
4.1%
l 964589
 
3.0%
, 845140
 
2.7%
Other values (73) 12087712
38.2%
None
ValueCountFrequency (%)
í 24109
82.7%
é 1678
 
5.8%
á 1098
 
3.8%
ñ 788
 
2.7%
â 452
 
1.6%
ó 240
 
0.8%
ú 196
 
0.7%
ô 169
 
0.6%
­ 126
 
0.4%
è 59
 
0.2%
Other values (12) 231
 
0.8%
Latin Ext Additional
ValueCountFrequency (%)
9
69.2%
2
 
15.4%
2
 
15.4%
Punctuation
ValueCountFrequency (%)
9
90.0%
1
 
10.0%

verbatimElevation
Text

Missing 

Distinct 2882
Distinct (%) 1.1%
Missing 331608
Missing (%) 56.8%
Memory size 4.5 MiB

Length

Max length 93
Median length 46
Mean length 7.093015246
Min length 3

Characters and Unicode

Total characters 1791646
Distinct characters 57
Distinct categories 9
Distinct scripts 2
Distinct blocks 2

Unique

Unique 530
Unique (%) 0.2%

Sample

1st row  4320 ft
2nd row  4351 ft
3rd row  2200 m
4th row  30-50 m
5th row  30 ft
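
The sample rows mix feet and metres and include ranges, so any numeric use of this column needs a normalisation step. The following is a minimal sketch, assuming values look like the samples above (a number or a low-high range followed by `ft` or `m`, optionally prefixed by `ca`); the function name `parse_elevation` and the regular expression are hypothetical, and strings in other layouts simply return `None`.

```python
import re

FEET_TO_METRES = 0.3048  # exact by definition of the international foot

# Assumed layout: optional "ca"/"ca.", a number or low-high range, then a unit.
_ELEVATION = re.compile(
    r"^\s*(?:ca\.?\s*)?(\d+(?:\.\d+)?)(?:\s*-\s*(\d+(?:\.\d+)?))?\s*(ft|m)\s*$",
    re.IGNORECASE,
)


def parse_elevation(text):
    """Return (min_m, max_m) in metres, or None if the string does not match."""
    match = _ELEVATION.match(text or "")
    if match is None:
        return None
    low = float(match.group(1))
    high = float(match.group(2)) if match.group(2) else low
    factor = FEET_TO_METRES if match.group(3).lower() == "ft" else 1.0
    return (low * factor, high * factor)


for raw in ["4320 ft", "2200 m", "30-50 m", "unknown"]:
    print(raw, "->", parse_elevation(raw))
```

Returning a pair keeps single values and ranges in one shape, which makes it straightforward to derive minimum and maximum elevation columns afterwards.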
Value                Count   Frequency (%)
ft                   191831  36.8%
m                    59860   11.5%
ca                   13358   2.6%
1100-1350            4058    0.8%
200                  3781    0.7%
10                   3450    0.7%
3400                 2848    0.5%
3500                 2819    0.5%
20                   2706    0.5%
3600                 2513    0.5%
Other values (2009)  234300  44.9%

Most occurring characters

ValueCountFrequency (%)
0 376273
21.0%
268931
15.0%
t 192412
10.7%
f 192004
10.7%
1 99566
 
5.6%
3 96808
 
5.4%
2 90988
 
5.1%
4 83319
 
4.7%
5 76675
 
4.3%
m 59946
 
3.3%
Other values (47) 254724
14.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 994929
55.5%
Lowercase Letter 481690
26.9%
Space Separator 268931
 
15.0%
Dash Punctuation 30052
 
1.7%
Other Punctuation 13757
 
0.8%
Close Punctuation 1006
 
0.1%
Open Punctuation 1006
 
0.1%
Math Symbol 195
 
< 0.1%
Uppercase Letter 80
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 192412
39.9%
f 192004
39.9%
m 59946
 
12.4%
a 14859
 
3.1%
c 13366
 
2.8%
e 3277
 
0.7%
l 1590
 
0.3%
v 1058
 
0.2%
s 835
 
0.2%
o 611
 
0.1%
Other values (15) 1732
 
0.4%
Decimal Number
ValueCountFrequency (%)
0 376273
37.8%
1 99566
 
10.0%
3 96808
 
9.7%
2 90988
 
9.1%
4 83319
 
8.4%
5 76675
 
7.7%
6 59540
 
6.0%
8 45372
 
4.6%
7 38030
 
3.8%
9 28358
 
2.9%
Uppercase Letter
ValueCountFrequency (%)
C 23
28.7%
S 15
18.8%
P 12
15.0%
G 12
15.0%
A 10
12.5%
D 5
 
6.2%
L 2
 
2.5%
M 1
 
1.2%
Other Punctuation
ValueCountFrequency (%)
. 13576
98.7%
, 90
 
0.7%
/ 39
 
0.3%
; 22
 
0.2%
? 22
 
0.2%
' 6
 
< 0.1%
2
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
< 110
56.4%
+ 75
38.5%
= 10
 
5.1%
Space Separator
ValueCountFrequency (%)
268931
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 30052
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1006
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1006
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1309876
73.1%
Latin 481770
 
26.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 192412
39.9%
f 192004
39.9%
m 59946
 
12.4%
a 14859
 
3.1%
c 13366
 
2.8%
e 3277
 
0.7%
l 1590
 
0.3%
v 1058
 
0.2%
s 835
 
0.2%
o 611
 
0.1%
Other values (23) 1812
 
0.4%
Common
ValueCountFrequency (%)
0 376273
28.7%
268931
20.5%
1 99566
 
7.6%
3 96808
 
7.4%
2 90988
 
6.9%
4 83319
 
6.4%
5 76675
 
5.9%
6 59540
 
4.5%
8 45372
 
3.5%
7 38030
 
2.9%
Other values (14) 74374
 
5.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1791644
> 99.9%
Punctuation 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 376273
21.0%
268931
15.0%
t 192412
10.7%
f 192004
10.7%
1 99566
 
5.6%
3 96808
 
5.4%
2 90988
 
5.1%
4 83319
 
4.7%
5 76675
 
4.3%
m 59946
 
3.3%
Other values (46) 254722
14.2%
Punctuation
ValueCountFrequency (%)
2
100.0%

decimalLatitude
Text

Missing 

Distinct 24490
Distinct (%) 5.8%
Missing 162667
Missing (%) 27.8%
Memory size 4.5 MiB

Length

Max length 11
Median length 7
Mean length 6.90323675
Min length 3

Characters and Unicode

Total characters 2909949
Distinct characters 13
Distinct categories 4
Distinct scripts 2
Distinct blocks 1

Unique

Unique 8226
Unique (%) 2.0%

Sample

1st row  -8.8201
2nd row  35.8083
3rd row  12.0217
4th row  38.39
5th row  40.9580375
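
The report treats this column as text, so coordinates must be converted before any numeric analysis. The sketch below shows one hedged way to do that with the standard library; `parse_coordinate`, `parse_latitude`, and `parse_longitude` are hypothetical helper names. The character statistics for this variable list 63 uppercase `E` characters, which would be consistent with scientific notation (an assumption, since the raw values are not shown); Python's `float()` accepts that form.

```python
def parse_coordinate(text, limit):
    """Convert a coordinate string to float.

    Returns None when the cell is missing, malformed, or outside [-limit, limit].
    """
    if text is None or not text.strip():
        return None
    try:
        value = float(text)  # float() accepts forms such as "1.0E-4"
    except ValueError:
        return None
    # The chained comparison is False for NaN and infinities, so they are rejected too.
    return value if -limit <= value <= limit else None


def parse_latitude(text):
    return parse_coordinate(text, 90.0)


def parse_longitude(text):
    return parse_coordinate(text, 180.0)


for raw in ["-8.8201", "40.9580375", "1.0E-4", "95", ""]:
    print(repr(raw), "->", parse_latitude(raw))
```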
Value                 Count   Frequency (%)
39.6306               4296    1.0%
13.6389               2247    0.5%
39.8872               1888    0.4%
12.83                 1754    0.4%
26.9844               1718    0.4%
4.0147                1664    0.4%
37.4161               1535    0.4%
36.7631               1511    0.4%
25.4017               1483    0.4%
36.9486               1468    0.3%
Other values (24041)  401970  95.4%

Most occurring characters

ValueCountFrequency (%)
3 480691
16.5%
. 421534
14.5%
1 245474
8.4%
6 236090
8.1%
8 232386
8.0%
4 230710
7.9%
5 230378
7.9%
7 210588
7.2%
2 210386
7.2%
9 198067
6.8%
Other values (3) 213645
7.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2429298
83.5%
Other Punctuation 421534
 
14.5%
Dash Punctuation 59054
 
2.0%
Uppercase Letter 63
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 480691
19.8%
1 245474
10.1%
6 236090
9.7%
8 232386
9.6%
4 230710
9.5%
5 230378
9.5%
7 210588
8.7%
2 210386
8.7%
9 198067
8.2%
0 154528
 
6.4%
Other Punctuation
ValueCountFrequency (%)
. 421534
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 59054
100.0%
Uppercase Letter
ValueCountFrequency (%)
E 63
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2909886
> 99.9%
Latin 63
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
3 480691
16.5%
. 421534
14.5%
1 245474
8.4%
6 236090
8.1%
8 232386
8.0%
4 230710
7.9%
5 230378
7.9%
7 210588
7.2%
2 210386
7.2%
9 198067
6.8%
Other values (2) 213582
7.3%
Latin
ValueCountFrequency (%)
E 63
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2909949
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 480691
16.5%
. 421534
14.5%
1 245474
8.4%
6 236090
8.1%
8 232386
8.0%
4 230710
7.9%
5 230378
7.9%
7 210588
7.2%
2 210386
7.2%
9 198067
6.8%
Other values (3) 213645
7.3%

decimalLongitude
Text

Missing 

Distinct 24797
Distinct (%) 5.9%
Missing 162667
Missing (%) 27.8%
Memory size 4.5 MiB

Length

Max length 12
Median length 8
Mean length 7.814814463
Min length 3

Characters and Unicode

Total characters 3294210
Distinct characters 12
Distinct categories 3
Distinct scripts 1
Distinct blocks 1

Unique

Unique 8175
Unique (%) 1.9%

Sample

1st row  146.53
2nd row  -82.3481
3rd row  -61.7664
4th row  -79.25
5th row  -115.4346518
Value                 Count   Frequency (%)
77.4714               4296    1.0%
144.962               2247    0.5%
77.7786               2139    0.5%
87.1889               1888    0.4%
69.28                 1763    0.4%
81.4919               1718    0.4%
80.5097               1653    0.4%
81.2228               1509    0.4%
80.6567               1483    0.4%
79.5561               1463    0.3%
Other values (24682)  401375  95.2%

Most occurring characters

ValueCountFrequency (%)
. 421534
12.8%
7 386519
11.7%
- 382134
11.6%
8 368334
11.2%
1 264538
8.0%
3 247782
7.5%
6 236340
7.2%
9 222019
6.7%
4 208966
6.3%
5 204225
6.2%
Other values (2) 351819
10.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2490542
75.6%
Other Punctuation 421534
 
12.8%
Dash Punctuation 382134
 
11.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 386519
15.5%
8 368334
14.8%
1 264538
10.6%
3 247782
9.9%
6 236340
9.5%
9 222019
8.9%
4 208966
8.4%
5 204225
8.2%
2 203737
8.2%
0 148082
 
5.9%
Other Punctuation
ValueCountFrequency (%)
. 421534
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 382134
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3294210
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 421534
12.8%
7 386519
11.7%
- 382134
11.6%
8 368334
11.2%
1 264538
8.0%
3 247782
7.5%
6 236340
7.2%
9 222019
6.7%
4 208966
6.3%
5 204225
6.2%
Other values (2) 351819
10.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3294210
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 421534
12.8%
7 386519
11.7%
- 382134
11.6%
8 368334
11.2%
1 264538
8.0%
3 247782
7.5%
6 236340
7.2%
9 222019
6.7%
4 208966
6.3%
5 204225
6.2%
Other values (2) 351819
10.7%
Distinct 7350
Distinct (%) 5.1%
Missing 439218
Missing (%) 75.2%
Memory size 4.5 MiB

Length

Max length 9
Median length 8
Mean length 6.368484581
Min length 3

Characters and Unicode

Total characters 923322
Distinct characters 11
Distinct categories 2
Distinct scripts 1
Distinct blocks 1

Unique

Unique 2148
Unique (%) 1.5%

Sample

1st row  402.34
2nd row  96.56
3rd row  152901.0
4th row  6115.0
5th row  1754.18
Value                Count   Frequency (%)
347.62               1384    1.0%
186.68               1338    0.9%
4615.0               1110    0.8%
5615.0               1066    0.7%
1066.0               1030    0.7%
3615.0               978     0.7%
5115.0               953     0.7%
4115.0               946     0.7%
177.03               882     0.6%
402.34               826     0.6%
Other values (7340)  134470  92.7%

Most occurring characters

ValueCountFrequency (%)
. 144983
15.7%
0 110601
12.0%
1 109273
11.8%
2 82741
9.0%
5 79271
8.6%
3 75055
8.1%
4 74981
8.1%
6 67932
7.4%
9 62051
6.7%
8 58564
6.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 778339
84.3%
Other Punctuation 144983
 
15.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 110601
14.2%
1 109273
14.0%
2 82741
10.6%
5 79271
10.2%
3 75055
9.6%
4 74981
9.6%
6 67932
8.7%
9 62051
8.0%
8 58564
7.5%
7 57870
7.4%
Other Punctuation
ValueCountFrequency (%)
. 144983
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 923322
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 144983
15.7%
0 110601
12.0%
1 109273
11.8%
2 82741
9.0%
5 79271
8.6%
3 75055
8.1%
4 74981
8.1%
6 67932
7.4%
9 62051
6.7%
8 58564
6.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 923322
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 144983
15.7%
0 110601
12.0%
1 109273
11.8%
2 82741
9.0%
5 79271
8.6%
3 75055
8.1%
4 74981
8.1%
6 67932
7.4%
9 62051
6.7%
8 58564
6.3%

georeferenceProtocol
Text

Missing 

Distinct 3371
Distinct (%) 2.3%
Missing 439136
Missing (%) 75.2%
Memory size 4.5 MiB

Length

Max length 302
Median length 251
Mean length 91.26128977
Min length 3

Characters and Unicode

Total characters 13238819
Distinct characters 86
Distinct categories 10
Distinct scripts 2
Distinct blocks 3

Unique

Unique 891
Unique (%) 0.6%

Sample

1st row  USGS Palo Alto Quad (TopoZone - 1:24,000), MaNIS/HerpNET/ORNIS Georeferencing Guidelines
2nd row  Terrain Navigator v. 5.03 USGS 1:24,000, MaNIS/HerpNET/ORNIS Georeferencing Guidelines
3rd row  Alexandria Digital Library Gazetteer, MaNIS/HerpNET/ORNIS Georeferencing Guidelines
4th row  USGS Chesterfield Quad (TopoZine - 1:24,000), MaNIS/HerpNET/ORNIS Georeferencing Guidelines
5th row  USGS Falls Church Quad (TopoZone - 1:24,000), MaNIS/HerpNET/ORNIS Georeferencing Guidelines
Value                Count   Frequency (%)
georeferencing       134216  9.7%
manis/herpnet/ornis  134163  9.7%
guidelines           134143  9.7%
usgs                 59079   4.3%
1:24,000             54333   3.9%
(not rendered)       44136   3.2%
quad                 39827   2.9%
digital              22588   1.6%
gazetteer            22105   1.6%
topozone             21638   1.6%
Other values (3792)  715459  51.8%

Most occurring characters

ValueCountFrequency (%)
e 1320173
 
10.0%
1236622
 
9.3%
r 733799
 
5.5%
i 691510
 
5.2%
a 629206
 
4.8%
n 622138
 
4.7%
o 500801
 
3.8%
N 461182
 
3.5%
S 454207
 
3.4%
G 414644
 
3.1%
Other values (76) 6174537
46.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7136568
53.9%
Uppercase Letter 3060694
23.1%
Space Separator 1236622
 
9.3%
Decimal Number 835786
 
6.3%
Other Punctuation 760937
 
5.7%
Open Punctuation 71491
 
0.5%
Close Punctuation 71272
 
0.5%
Dash Punctuation 65161
 
0.5%
Connector Punctuation 248
 
< 0.1%
Math Symbol 40
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1320173
18.5%
r 733799
10.3%
i 691510
9.7%
a 629206
8.8%
n 622138
8.7%
o 500801
 
7.0%
l 307980
 
4.3%
d 294924
 
4.1%
t 258449
 
3.6%
g 250955
 
3.5%
Other values (19) 1526633
21.4%
Uppercase Letter
ValueCountFrequency (%)
N 461182
15.1%
S 454207
14.8%
G 414644
13.5%
I 303621
9.9%
T 221610
7.2%
M 189899
6.2%
E 166244
 
5.4%
O 161796
 
5.3%
R 151192
 
4.9%
H 140237
 
4.6%
Other values (17) 396062
12.9%
Other Punctuation
ValueCountFrequency (%)
/ 286534
37.7%
, 258402
34.0%
: 100996
 
13.3%
. 80708
 
10.6%
; 15057
 
2.0%
! 9034
 
1.2%
# 6647
 
0.9%
' 2637
 
0.3%
& 813
 
0.1%
? 94
 
< 0.1%
Other values (3) 15
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 379892
45.5%
1 133690
 
16.0%
2 100269
 
12.0%
4 76915
 
9.2%
5 38693
 
4.6%
7 25544
 
3.1%
9 22590
 
2.7%
6 22338
 
2.7%
3 22202
 
2.7%
8 13653
 
1.6%
Math Symbol
ValueCountFrequency (%)
+ 24
60.0%
= 16
40.0%
Space Separator
ValueCountFrequency (%)
1236622
100.0%
Open Punctuation
ValueCountFrequency (%)
( 71491
100.0%
Close Punctuation
ValueCountFrequency (%)
) 71272
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 65161
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 248
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10197262
77.0%
Common 3041557
 
23.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1320173
 
12.9%
r 733799
 
7.2%
i 691510
 
6.8%
a 629206
 
6.2%
n 622138
 
6.1%
o 500801
 
4.9%
N 461182
 
4.5%
S 454207
 
4.5%
G 414644
 
4.1%
l 307980
 
3.0%
Other values (46) 4061622
39.8%
Common
ValueCountFrequency (%)
1236622
40.7%
0 379892
 
12.5%
/ 286534
 
9.4%
, 258402
 
8.5%
1 133690
 
4.4%
: 100996
 
3.3%
2 100269
 
3.3%
. 80708
 
2.7%
4 76915
 
2.5%
( 71491
 
2.4%
Other values (20) 316038
 
10.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13234776
> 99.9%
None 4039
 
< 0.1%
Punctuation 4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1320173
 
10.0%
1236622
 
9.3%
r 733799
 
5.5%
i 691510
 
5.2%
a 629206
 
4.8%
n 622138
 
4.7%
o 500801
 
3.8%
N 461182
 
3.5%
S 454207
 
3.4%
G 414644
 
3.1%
Other values (71) 6170494
46.6%
None
ValueCountFrequency (%)
í 4030
99.8%
é 5
 
0.1%
ô 2
 
< 0.1%
Î 2
 
< 0.1%
Punctuation
ValueCountFrequency (%)
4
100.0%

georeferenceRemarks
Text

Missing 

Distinct 3681
Distinct (%) 2.6%
Missing 443625
Missing (%) 75.9%
Memory size 4.5 MiB

Length

Max length 83
Median length 55
Mean length 22.53162702
Min length 7

Characters and Unicode

Total characters 3167406
Distinct characters 64
Distinct categories 9
Distinct scripts 2
Distinct blocks 1

Unique

Unique 1057
Unique (%) 0.8%

Sample

1st row  Locality extent = 0.05
2nd row  Locality extent = 95
3rd row  Locality extent = 3.5
4th row  Datum Guam 63
5th row  Locality extent = 1.08
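
Most sample remarks follow a `Locality extent = <number>` pattern, which lends itself to simple extraction. The sketch below is a hedged illustration: the function name `parse_extent` is hypothetical, and the optional trailing unit (`mi` or `km`) is an assumption based on those tokens appearing among the most frequent words for this variable, not on a value shown in the samples. Remarks in other layouts, such as `Datum Guam 63`, return `None`.

```python
import re

# Assumed layout: "Locality extent = <number>", optionally followed by a unit token.
_EXTENT = re.compile(
    r"Locality extent\s*=\s*(\d+(?:\.\d+)?)\s*(mi|km)?\b",
    re.IGNORECASE,
)


def parse_extent(remark):
    """Return (value, unit) where unit is 'mi', 'km', or None; None if no extent is found."""
    match = _EXTENT.search(remark or "")
    if match is None:
        return None
    unit = match.group(2).lower() if match.group(2) else None
    return (float(match.group(1)), unit)


for raw in ["Locality extent = 0.05", "Locality extent = 95", "Datum Guam 63"]:
    print(raw, "->", parse_extent(raw))
```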
Value                Count   Frequency (%)
extent               134257  22.0%
=                    134207  22.0%
locality             134203  22.0%
mi                   40072   6.6%
km                   8736    1.4%
0.1                  7251    1.2%
datum                6200    1.0%
63                   5497    0.9%
guam                 5494    0.9%
1                    5323    0.9%
Other values (2938)  128798  21.1%

Most occurring characters

ValueCountFrequency (%)
469462
14.8%
t 411232
13.0%
e 269464
 
8.5%
i 175099
 
5.5%
. 149589
 
4.7%
a 146541
 
4.6%
l 134689
 
4.3%
n 134567
 
4.2%
o 134447
 
4.2%
y 134376
 
4.2%
Other values (54) 1007940
31.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1896549
59.9%
Space Separator 469462
 
14.8%
Decimal Number 368654
 
11.6%
Other Punctuation 149871
 
4.7%
Uppercase Letter 148496
 
4.7%
Math Symbol 134208
 
4.2%
Dash Punctuation 72
 
< 0.1%
Open Punctuation 47
 
< 0.1%
Close Punctuation 47
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 411232
21.7%
e 269464
14.2%
i 175099
9.2%
a 146541
 
7.7%
l 134689
 
7.1%
n 134567
 
7.1%
o 134447
 
7.1%
y 134376
 
7.1%
x 134300
 
7.1%
c 134263
 
7.1%
Other values (14) 87571
 
4.6%
Uppercase Letter
ValueCountFrequency (%)
L 134266
90.4%
G 6166
 
4.2%
D 6026
 
4.1%
S 774
 
0.5%
W 687
 
0.5%
H 144
 
0.1%
N 119
 
0.1%
P 107
 
0.1%
E 71
 
< 0.1%
A 37
 
< 0.1%
Other values (9) 99
 
0.1%
Decimal Number
ValueCountFrequency (%)
0 74829
20.3%
1 61579
16.7%
5 51221
13.9%
2 46439
12.6%
3 35147
9.5%
6 23996
 
6.5%
4 21925
 
5.9%
7 21708
 
5.9%
8 19177
 
5.2%
9 12633
 
3.4%
Other Punctuation
ValueCountFrequency (%)
. 149589
99.8%
; 174
 
0.1%
, 71
 
< 0.1%
: 19
 
< 0.1%
/ 12
 
< 0.1%
' 6
 
< 0.1%
Space Separator
ValueCountFrequency (%)
469462
100.0%
Math Symbol
ValueCountFrequency (%)
= 134208
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 72
100.0%
Open Punctuation
ValueCountFrequency (%)
( 47
100.0%
Close Punctuation
ValueCountFrequency (%)
) 47
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2045045
64.6%
Common 1122361
35.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 411232
20.1%
e 269464
13.2%
i 175099
8.6%
a 146541
 
7.2%
l 134689
 
6.6%
n 134567
 
6.6%
o 134447
 
6.6%
y 134376
 
6.6%
x 134300
 
6.6%
L 134266
 
6.6%
Other values (33) 236064
11.5%
Common
ValueCountFrequency (%)
469462
41.8%
. 149589
 
13.3%
= 134208
 
12.0%
0 74829
 
6.7%
1 61579
 
5.5%
5 51221
 
4.6%
2 46439
 
4.1%
3 35147
 
3.1%
6 23996
 
2.1%
4 21925
 
2.0%
Other values (11) 53966
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3167406
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
469462
14.8%
t 411232
13.0%
e 269464
 
8.5%
i 175099
 
5.5%
. 149589
 
4.7%
a 146541
 
4.6%
l 134689
 
4.3%
n 134567
 
4.2%
o 134447
 
4.2%
y 134376
 
4.2%
Other values (54) 1007940
31.8%
Distinct 3
Distinct (%) 0.7%
Missing 583784
Missing (%) 99.9%
Memory size 4.5 MiB

Length

Max length 9
Median length 3
Mean length 3.167865707
Min length 3

Characters and Unicode

Total characters 1321
Distinct characters 10
Distinct categories 2
Distinct scripts 2
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row  aff.
2nd row  cf.
3rd row  cf.
4th row  cf.
5th row  cf.
Value      Count  Frequency (%)
cf         382    91.6%
aff        28     6.7%
uncertain  7      1.7%

Most occurring characters

ValueCountFrequency (%)
f 438
33.2%
. 410
31.0%
c 389
29.4%
a 35
 
2.6%
n 14
 
1.1%
u 7
 
0.5%
e 7
 
0.5%
r 7
 
0.5%
t 7
 
0.5%
i 7
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 911
69.0%
Other Punctuation 410
31.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
f 438
48.1%
c 389
42.7%
a 35
 
3.8%
n 14
 
1.5%
u 7
 
0.8%
e 7
 
0.8%
r 7
 
0.8%
t 7
 
0.8%
i 7
 
0.8%
Other Punctuation
ValueCountFrequency (%)
. 410
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 911
69.0%
Common 410
31.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
f 438
48.1%
c 389
42.7%
a 35
 
3.8%
n 14
 
1.5%
u 7
 
0.8%
e 7
 
0.8%
r 7
 
0.8%
t 7
 
0.8%
i 7
 
0.8%
Common
ValueCountFrequency (%)
. 410
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1321
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
f 438
33.2%
. 410
31.0%
c 389
29.4%
a 35
 
2.6%
n 14
 
1.1%
u 7
 
0.5%
e 7
 
0.5%
r 7
 
0.5%
t 7
 
0.5%
i 7
 
0.5%

typeStatus
Text

Missing 

Distinct 6
Distinct (%) < 0.1%
Missing 571070
Missing (%) 97.8%
Memory size 4.5 MiB

Length

Max length 13
Median length 8
Mean length 8.014698043
Min length 7

Characters and Unicode

Total characters 105241
Distinct characters 12
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row  PARATYPE
2nd row  PARATYPE
3rd row  PARATYPE
4th row  PARATYPE
5th row  PARALECTOTYPE
Value          Count  Frequency (%)
paratype       10832  82.5%
holotype       1222   9.3%
syntype        835    6.4%
paralectotype  208    1.6%
neotype        23     0.2%
lectotype      11     0.1%

Most occurring characters

ValueCountFrequency (%)
P 24171
23.0%
A 22080
21.0%
Y 13966
13.3%
E 13373
12.7%
T 13350
12.7%
R 11040
10.5%
O 2686
 
2.6%
L 1441
 
1.4%
H 1222
 
1.2%
N 858
 
0.8%
Other values (2) 1054
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 105241
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
P 24171
23.0%
A 22080
21.0%
Y 13966
13.3%
E 13373
12.7%
T 13350
12.7%
R 11040
10.5%
O 2686
 
2.6%
L 1441
 
1.4%
H 1222
 
1.2%
N 858
 
0.8%
Other values (2) 1054
 
1.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 105241
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
P 24171
23.0%
A 22080
21.0%
Y 13966
13.3%
E 13373
12.7%
T 13350
12.7%
R 11040
10.5%
O 2686
 
2.6%
L 1441
 
1.4%
H 1222
 
1.2%
N 858
 
0.8%
Other values (2) 1054
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 105241
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
P 24171
23.0%
A 22080
21.0%
Y 13966
13.3%
E 13373
12.7%
T 13350
12.7%
R 11040
10.5%
O 2686
 
2.6%
L 1441
 
1.4%
H 1222
 
1.2%
N 858
 
0.8%
Other values (2) 1054
 
1.0%

identifiedBy
Text

Missing 

Distinct 8
Distinct (%) 10.5%
Missing 584125
Missing (%) > 99.9%
Memory size 4.5 MiB

Length

Max length 122
Median length 18
Mean length 25.17105263
Min length 14

Characters and Unicode

Total characters 1913
Distinct characters 49
Distinct categories 7
Distinct scripts 2
Distinct blocks 1

Unique

Unique 4
Unique (%) 5.3%

Sample

1st row  Gower, David, (BMNH), The Natural History Museum (UNITED KINGDOM)
2nd row  Crombie, Ronald I.
3rd row  Crombie, Ronald I.
4th row  Crombie, Ronald I.
5th row  Crombie, Ronald I.
Value              Count  Frequency (%)
ronald             56     18.7%
crombie            55     18.3%
i                  55     18.3%
natural            11     3.7%
history            11     3.7%
museum             11     3.7%
united             11     3.7%
gower              10     3.3%
david              10     3.3%
bmnh               10     3.3%
Other values (26)  60     20.0%

Most occurring characters

ValueCountFrequency (%)
224
 
11.7%
o 146
 
7.6%
e 102
 
5.3%
r 99
 
5.2%
, 98
 
5.1%
a 95
 
5.0%
i 87
 
4.5%
I 77
 
4.0%
n 73
 
3.8%
d 73
 
3.8%
Other values (39) 839
43.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1027
53.7%
Uppercase Letter 452
23.6%
Space Separator 224
 
11.7%
Other Punctuation 163
 
8.5%
Close Punctuation 22
 
1.2%
Open Punctuation 22
 
1.2%
Dash Punctuation 3
 
0.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 77
17.0%
R 61
13.5%
C 58
12.8%
N 43
9.5%
M 31
6.9%
D 31
6.9%
H 27
 
6.0%
G 24
 
5.3%
T 23
 
5.1%
E 14
 
3.1%
Other values (12) 63
13.9%
Lowercase Letter
ValueCountFrequency (%)
o 146
14.2%
e 102
9.9%
r 99
9.6%
a 95
9.3%
i 87
8.5%
n 73
7.1%
d 73
7.1%
l 69
6.7%
m 68
6.6%
b 56
 
5.5%
Other values (11) 159
15.5%
Other Punctuation
ValueCountFrequency (%)
, 98
60.1%
. 65
39.9%
Space Separator
ValueCountFrequency (%)
224
100.0%
Close Punctuation
ValueCountFrequency (%)
) 22
100.0%
Open Punctuation
ValueCountFrequency (%)
( 22
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1479
77.3%
Common 434
 
22.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 146
 
9.9%
e 102
 
6.9%
r 99
 
6.7%
a 95
 
6.4%
i 87
 
5.9%
I 77
 
5.2%
n 73
 
4.9%
d 73
 
4.9%
l 69
 
4.7%
m 68
 
4.6%
Other values (33) 590
39.9%
Common
ValueCountFrequency (%)
224
51.6%
, 98
22.6%
. 65
 
15.0%
) 22
 
5.1%
( 22
 
5.1%
- 3
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1913
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
224
 
11.7%
o 146
 
7.6%
e 102
 
5.3%
r 99
 
5.2%
, 98
 
5.1%
a 95
 
5.0%
i 87
 
4.5%
I 77
 
4.0%
n 73
 
3.8%
d 73
 
3.8%
Other values (39) 839
43.9%
Distinct 8475
Distinct (%) 1.5%
Missing 0
Missing (%) 0.0%
Memory size 4.5 MiB

Length

Max length 8
Median length 7
Mean length 7.019563472
Min length 1

Characters and Unicode

Total characters 4100836
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 1520
Unique (%) 0.3%

Sample

1st row  5225055
2nd row  2431506
3rd row  5224383
4th row  2446249
5th row  2467415
Value                Count   Frequency (%)
2431491              75714   13.0%
2431539              13092   2.2%
2431224              10146   1.7%
2431506              9986    1.7%
2431516              8012    1.4%
2431529              7074    1.2%
2431489              6103    1.0%
2431484              5929    1.0%
2431219              4681    0.8%
2431510              4614    0.8%
Other values (8465)  438850  75.1%

Most occurring characters

ValueCountFrequency (%)
2 850954
20.8%
4 741903
18.1%
1 558280
13.6%
3 457912
11.2%
5 330769
 
8.1%
9 302439
 
7.4%
6 235990
 
5.8%
8 216124
 
5.3%
7 207914
 
5.1%
0 198551
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4100836
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 850954
20.8%
4 741903
18.1%
1 558280
13.6%
3 457912
11.2%
5 330769
 
8.1%
9 302439
 
7.4%
6 235990
 
5.8%
8 216124
 
5.3%
7 207914
 
5.1%
0 198551
 
4.8%

Most occurring scripts

ValueCountFrequency (%)
Common 4100836
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 850954
20.8%
4 741903
18.1%
1 558280
13.6%
3 457912
11.2%
5 330769
 
8.1%
9 302439
 
7.4%
6 235990
 
5.8%
8 216124
 
5.3%
7 207914
 
5.1%
0 198551
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4100836
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 850954
20.8%
4 741903
18.1%
1 558280
13.6%
3 457912
11.2%
5 330769
 
8.1%
9 302439
 
7.4%
6 235990
 
5.8%
8 216124
 
5.3%
7 207914
 
5.1%
0 198551
 
4.8%
Distinct 9012
Distinct (%) 1.5%
Missing 0
Missing (%) 0.0%
Memory size 4.5 MiB

Length

Max length 182
Median length 112
Mean length 35.63831969
Min length 5

Characters and Unicode

Total characters 20819942
Distinct characters 88
Distinct categories 8
Distinct scripts 2
Distinct blocks 2

Unique

Unique 1713
Unique (%) 0.3%

Sample

1st row  Carlia bicarinata (Macleay, 1877)
2nd row  Plethodon montanus Highton & Peabody, 2000
3rd row  Enhydris enhydris (Schneider, 1799)
4th row  Gehyra mutilata (Wiegmann, 1834)
5th row  Anolis richardii Duméril & Bibron, 1837
Value                Count    Frequency (%)
plethodon            168423   6.7%
green                95287    3.8%
1818                 93378    3.7%
&                    81423    3.2%
cinereus             75774    3.0%
desmognathus         35846    1.4%
cope                 33117    1.3%
duméril              26833    1.1%
linnaeus             26096    1.0%
bibron               23820    0.9%
Other values (8722)  1859231  73.8%

Most occurring characters

ValueCountFrequency (%)
1935027
 
9.3%
e 1567127
 
7.5%
o 1187336
 
5.7%
n 1168671
 
5.6%
a 1154833
 
5.5%
i 1117013
 
5.4%
s 1062494
 
5.1%
r 1046115
 
5.0%
t 861966
 
4.1%
l 827376
 
4.0%
Other values (78) 8891984
42.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13925691
66.9%
Decimal Number 2300144
 
11.0%
Space Separator 1935027
 
9.3%
Uppercase Letter 1279500
 
6.1%
Other Punctuation 668876
 
3.2%
Open Punctuation 351966
 
1.7%
Close Punctuation 351966
 
1.7%
Dash Punctuation 6772
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1567127
11.3%
o 1187336
 
8.5%
n 1168671
 
8.4%
a 1154833
 
8.3%
i 1117013
 
8.0%
s 1062494
 
7.6%
r 1046115
 
7.5%
t 861966
 
6.2%
l 827376
 
5.9%
u 775008
 
5.6%
Other values (32) 3157752
22.7%
Uppercase Letter
ValueCountFrequency (%)
P 235098
18.4%
G 164331
12.8%
D 110463
8.6%
B 109196
8.5%
L 97183
7.6%
S 85301
 
6.7%
H 85115
 
6.7%
C 81625
 
6.4%
A 65870
 
5.1%
E 36637
 
2.9%
Other values (18) 208681
16.3%
Decimal Number
ValueCountFrequency (%)
1 722241
31.4%
8 557749
24.2%
9 226895
 
9.9%
2 147848
 
6.4%
0 133680
 
5.8%
5 118630
 
5.2%
7 113116
 
4.9%
6 102351
 
4.4%
3 97005
 
4.2%
4 80629
 
3.5%
Other Punctuation
ValueCountFrequency (%)
, 585483
87.5%
& 81423
 
12.2%
. 1066
 
0.2%
' 904
 
0.1%
Space Separator
ValueCountFrequency (%)
(space) 1935027
100.0%
Open Punctuation
ValueCountFrequency (%)
( 351966
100.0%
Close Punctuation
ValueCountFrequency (%)
) 351966
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6772
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15205191
73.0%
Common 5614751
 
27.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1567127
 
10.3%
o 1187336
 
7.8%
n 1168671
 
7.7%
a 1154833
 
7.6%
i 1117013
 
7.3%
s 1062494
 
7.0%
r 1046115
 
6.9%
t 861966
 
5.7%
l 827376
 
5.4%
u 775008
 
5.1%
Other values (60) 4437252
29.2%
Common
ValueCountFrequency (%)
(space) 1935027
34.5%
1 722241
 
12.9%
, 585483
 
10.4%
8 557749
 
9.9%
( 351966
 
6.3%
) 351966
 
6.3%
9 226895
 
4.0%
2 147848
 
2.6%
0 133680
 
2.4%
5 118630
 
2.1%
Other values (8) 483266
 
8.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20775558
99.8%
None 44384
 
0.2%
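
The ASCII versus non-ASCII split in the table above can be approximated with the built-in `str.isascii()` method. This is only a sketch: the helper name `ascii_split` is hypothetical, and the profiler's own block lookup (which labels the non-ASCII characters "None" here) is not shown in this report.

```python
from collections import Counter


def ascii_split(values):
    """Count characters inside and outside the ASCII range (U+0000 to U+007F)."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts["ASCII" if ch.isascii() else "non-ASCII"] += 1
    return counts


# Author names such as "Duméril" account for the small non-ASCII share.
split = ascii_split(["Dum\u00e9ril", "Bibron"])
print(split)
```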

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 1935027
 
9.3%
e 1567127
 
7.5%
o 1187336
 
5.7%
n 1168671
 
5.6%
a 1154833
 
5.6%
i 1117013
 
5.4%
s 1062494
 
5.1%
r 1046115
 
5.0%
t 861966
 
4.1%
l 827376
 
4.0%
Other values (60) 8847600
42.6%
None
ValueCountFrequency (%)
é 29442
66.3%
ü 10886
 
24.5%
è 1680
 
3.8%
ö 1276
 
2.9%
Ö 294
 
0.7%
í 269
 
0.6%
ñ 249
 
0.6%
á 152
 
0.3%
ó 71
 
0.2%
å 20
 
< 0.1%
Other values (8) 45
 
0.1%
Distinct167
Distinct (%)< 0.1%
Missing2
Missing (%)< 0.1%
Memory size4.5 MiB

Length

Max length86
Median length82
Mean length66.44007265
Min length10

Characters and Unicode

Total characters38814224
Distinct characters46
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique3 ?
Unique (%)< 0.1%

Sample

1st rowAnimalia, Chordata, Vertebrata, Reptilia, Squamata, Sauria, Scincidae, Eugongylinae
2nd rowAnimalia, Chordata, Vertebrata, Amphibia, Caudata, Plethodontidae
3rd rowAnimalia, Chordata, Vertebrata, Reptilia, Squamata, Ophidia, Homalopsinae
4th rowAnimalia, Chordata, Vertebrata, Reptilia, Squamata, Sauria, Gekkoninae
5th rowAnimalia, Chordata, Vertebrata, Reptilia, Squamata, Sauria, Polychrotinae
ValueCountFrequency (%)
animalia 584195
15.7%
vertebrata 584195
15.7%
chordata 584178
15.7%
amphibia 395159
10.6%
caudata 237127
6.4%
plethodontidae 221369
 
5.9%
reptilia 189036
 
5.1%
squamata 169309
 
4.5%
anura 157511
 
4.2%
sauria 116154
 
3.1%
Other values (166) 484544
13.0%

Most occurring characters

ValueCountFrequency (%)
a 6566805
16.9%
i 3313617
 
8.5%
, 3138578
 
8.1%
(space) 3138578
 
8.1%
t 3000106
 
7.7%
e 2360956
 
6.1%
r 2244920
 
5.8%
d 1648115
 
4.2%
h 1357195
 
3.5%
n 1355739
 
3.5%
Other values (36) 10689615
27.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 28814291
74.2%
Uppercase Letter 3722777
 
9.6%
Other Punctuation 3138578
 
8.1%
Space Separator 3138578
 
8.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6566805
22.8%
i 3313617
11.5%
t 3000106
10.4%
e 2360956
 
8.2%
r 2244920
 
7.8%
d 1648115
 
5.7%
h 1357195
 
4.7%
n 1355739
 
4.7%
o 1350378
 
4.7%
m 1224848
 
4.3%
Other values (14) 4391612
15.2%
Uppercase Letter
ValueCountFrequency (%)
A 1151930
30.9%
C 876519
23.5%
V 590792
15.9%
S 343033
 
9.2%
P 265039
 
7.1%
R 211930
 
5.7%
O 52750
 
1.4%
H 46430
 
1.2%
E 33840
 
0.9%
T 33424
 
0.9%
Other values (10) 117090
 
3.1%
Other Punctuation
ValueCountFrequency (%)
, 3138578
100.0%
Space Separator
ValueCountFrequency (%)
(space) 3138578
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 32537068
83.8%
Common 6277156
 
16.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 6566805
20.2%
i 3313617
10.2%
t 3000106
 
9.2%
e 2360956
 
7.3%
r 2244920
 
6.9%
d 1648115
 
5.1%
h 1357195
 
4.2%
n 1355739
 
4.2%
o 1350378
 
4.2%
m 1224848
 
3.8%
Other values (34) 8114389
24.9%
Common
ValueCountFrequency (%)
, 3138578
50.0%
(space) 3138578
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 38814224
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 6566805
16.9%
i 3313617
 
8.5%
, 3138578
 
8.1%
(space) 3138578
 
8.1%
t 3000106
 
7.7%
e 2360956
 
6.1%
r 2244920
 
5.8%
d 1648115
 
4.2%
h 1357195
 
3.5%
n 1355739
 
3.5%
Other values (36) 10689615
27.5%

kingdom
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.5 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters4673608
Distinct characters6
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowAnimalia
2nd rowAnimalia
3rd rowAnimalia
4th rowAnimalia
5th rowAnimalia
ValueCountFrequency (%)
animalia 584201
100.0%

Most occurring characters

ValueCountFrequency (%)
i 1168402
25.0%
a 1168402
25.0%
A 584201
12.5%
n 584201
12.5%
m 584201
12.5%
l 584201
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4089407
87.5%
Uppercase Letter 584201
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1168402
28.6%
a 1168402
28.6%
n 584201
14.3%
m 584201
14.3%
l 584201
14.3%
Uppercase Letter
ValueCountFrequency (%)
A 584201
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4673608
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 1168402
25.0%
a 1168402
25.0%
A 584201
12.5%
n 584201
12.5%
m 584201
12.5%
l 584201
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4673608
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 1168402
25.0%
a 1168402
25.0%
A 584201
12.5%
n 584201
12.5%
m 584201
12.5%
l 584201
12.5%

phylum
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing5
Missing (%)< 0.1%
Memory size4.5 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters4673568
Distinct characters7
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowChordata
2nd rowChordata
3rd rowChordata
4th rowChordata
5th rowChordata
ValueCountFrequency (%)
chordata 584196
100.0%

Most occurring characters

ValueCountFrequency (%)
a 1168392
25.0%
C 584196
12.5%
h 584196
12.5%
o 584196
12.5%
r 584196
12.5%
d 584196
12.5%
t 584196
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4089372
87.5%
Uppercase Letter 584196
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1168392
28.6%
h 584196
14.3%
o 584196
14.3%
r 584196
14.3%
d 584196
14.3%
t 584196
14.3%
Uppercase Letter
ValueCountFrequency (%)
C 584196
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4673568
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1168392
25.0%
C 584196
12.5%
h 584196
12.5%
o 584196
12.5%
r 584196
12.5%
d 584196
12.5%
t 584196
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4673568
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1168392
25.0%
C 584196
12.5%
h 584196
12.5%
o 584196
12.5%
r 584196
12.5%
d 584196
12.5%
t 584196
12.5%

class
Text

Distinct5
Distinct (%)< 0.1%
Missing203
Missing (%)< 0.1%
Memory size4.5 MiB

Length

Max length12
Median length8
Mean length8.067606396
Min length8
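
The mean length for this variable appears to be the total character count divided by the number of non-missing values. The short check below uses only figures reported for `class` (584201 observations from the overview, 203 missing, 4711466 total characters); the variable names are illustrative, and the formula is an inference from these numbers rather than a documented definition.

```python
# Figures copied from this report.
n_observations = 584201      # dataset-wide number of observations
n_missing = 203              # missing values in `class`
total_characters = 4711466   # total characters in `class`

# Mean length over the values that are actually present.
n_present = n_observations - n_missing
mean_length = total_characters / n_present

print(n_present)               # 583998
print(round(mean_length, 6))
```

The result agrees with the reported mean length of 8.067606396 to the displayed precision.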

Characters and Unicode

Total characters4711466
Distinct characters22
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowSquamata
2nd rowAmphibia
3rd rowSquamata
4th rowSquamata
5th rowSquamata
ValueCountFrequency (%)
amphibia 395161
67.7%
squamata 169110
29.0%
testudines 18909
 
3.2%
crocodylia 804
 
0.1%
sphenodontia 14
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
a 903309
19.2%
i 810049
17.2%
m 564271
12.0%
p 395175
8.4%
h 395175
8.4%
A 395161
8.4%
b 395161
8.4%
t 188033
 
4.0%
u 188019
 
4.0%
S 169124
 
3.6%
Other values (12) 307989
 
6.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4127468
87.6%
Uppercase Letter 583998
 
12.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 903309
21.9%
i 810049
19.6%
m 564271
13.7%
p 395175
9.6%
h 395175
9.6%
b 395161
9.6%
t 188033
 
4.6%
u 188019
 
4.6%
q 169110
 
4.1%
e 37832
 
0.9%
Other values (8) 81334
 
2.0%
Uppercase Letter
ValueCountFrequency (%)
A 395161
67.7%
S 169124
29.0%
T 18909
 
3.2%
C 804
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 4711466
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 903309
19.2%
i 810049
17.2%
m 564271
12.0%
p 395175
8.4%
h 395175
8.4%
A 395161
8.4%
b 395161
8.4%
t 188033
 
4.0%
u 188019
 
4.0%
S 169124
 
3.6%
Other values (12) 307989
 
6.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4711466
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 903309
19.2%
i 810049
17.2%
m 564271
12.0%
p 395175
8.4%
h 395175
8.4%
A 395161
8.4%
b 395161
8.4%
t 188033
 
4.0%
u 188019
 
4.0%
S 169124
 
3.6%
Other values (12) 307989
 
6.5%

order
Text

Missing 

Distinct3
Distinct (%)< 0.1%
Missing189040
Missing (%)32.4%
Memory size4.5 MiB

Length

Max length11
Median length7
Mean length6.208074683
Min length5

Characters and Unicode

Total characters2453189
Distinct characters15
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowCaudata
2nd rowCaudata
3rd rowAnura
4th rowAnura
5th rowCaudata
ValueCountFrequency (%)
caudata 237129
60.0%
anura 157511
39.9%
gymnophiona 521
 
0.1%

Most occurring characters

ValueCountFrequency (%)
a 869419
35.4%
u 394640
16.1%
C 237129
 
9.7%
d 237129
 
9.7%
t 237129
 
9.7%
n 158553
 
6.5%
A 157511
 
6.4%
r 157511
 
6.4%
o 1042
 
< 0.1%
G 521
 
< 0.1%
Other values (5) 2605
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2058028
83.9%
Uppercase Letter 395161
 
16.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 869419
42.2%
u 394640
19.2%
d 237129
 
11.5%
t 237129
 
11.5%
n 158553
 
7.7%
r 157511
 
7.7%
o 1042
 
0.1%
y 521
 
< 0.1%
m 521
 
< 0.1%
p 521
 
< 0.1%
Other values (2) 1042
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
C 237129
60.0%
A 157511
39.9%
G 521
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 2453189
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 869419
35.4%
u 394640
16.1%
C 237129
 
9.7%
d 237129
 
9.7%
t 237129
 
9.7%
n 158553
 
6.5%
A 157511
 
6.4%
r 157511
 
6.4%
o 1042
 
< 0.1%
G 521
 
< 0.1%
Other values (5) 2605
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2453189
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 869419
35.4%
u 394640
16.1%
C 237129
 
9.7%
d 237129
 
9.7%
t 237129
 
9.7%
n 158553
 
6.5%
A 157511
 
6.4%
r 157511
 
6.4%
o 1042
 
< 0.1%
G 521
 
< 0.1%
Other values (5) 2605
 
0.1%

family
Text

Distinct159
Distinct (%)< 0.1%
Missing587
Missing (%)0.1%
Memory size4.5 MiB

Length

Max length20
Median length19
Mean length12.00749468
Min length6

Characters and Unicode

Total characters7007742
Distinct characters42
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st rowScincidae
2nd rowPlethodontidae
3rd rowHomalopsidae
4th rowGekkonidae
5th rowDactyloidae
ValueCountFrequency (%)
plethodontidae 221371
37.9%
hylidae 41566
 
7.1%
colubridae 38793
 
6.6%
scincidae 26153
 
4.5%
bufonidae 25125
 
4.3%
ranidae 20333
 
3.5%
dactyloidae 18373
 
3.1%
gekkonidae 17255
 
3.0%
phrynosomatidae 16259
 
2.8%
leptodactylidae 10435
 
1.8%
Other values (149) 147951
25.4%

Most occurring characters

ValueCountFrequency (%)
e 917602
13.1%
d 865810
12.4%
a 768920
11.0%
o 698920
10.0%
i 662199
9.4%
t 582781
8.3%
l 411564
 
5.9%
n 371028
 
5.3%
h 296650
 
4.2%
P 246994
 
3.5%
Other values (32) 1185274
16.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6424128
91.7%
Uppercase Letter 583614
 
8.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 917602
14.3%
d 865810
13.5%
a 768920
12.0%
o 698920
10.9%
i 662199
10.3%
t 582781
9.1%
l 411564
6.4%
n 371028
5.8%
h 296650
 
4.6%
r 160765
 
2.5%
Other values (12) 687889
10.7%
Uppercase Letter
ValueCountFrequency (%)
P 246994
42.3%
C 61224
 
10.5%
H 46641
 
8.0%
S 42563
 
7.3%
D 32211
 
5.5%
B 27193
 
4.7%
R 22885
 
3.9%
E 20369
 
3.5%
G 20274
 
3.5%
A 16289
 
2.8%
Other values (10) 46971
 
8.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7007742
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 917602
13.1%
d 865810
12.4%
a 768920
11.0%
o 698920
10.0%
i 662199
9.4%
t 582781
8.3%
l 411564
 
5.9%
n 371028
 
5.3%
h 296650
 
4.2%
P 246994
 
3.5%
Other values (32) 1185274
16.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7007742
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 917602
13.1%
d 865810
12.4%
a 768920
11.0%
o 698920
10.0%
i 662199
9.4%
t 582781
8.3%
l 411564
 
5.9%
n 371028
 
5.3%
h 296650
 
4.2%
P 246994
 
3.5%
Other values (32) 1185274
16.9%

genus
Text

Distinct1416
Distinct (%)0.2%
Missing1685
Missing (%)0.3%
Memory size4.5 MiB

Length

Max length18
Median length16
Mean length9.556288583
Min length3

Characters and Unicode

Total characters5566691
Distinct characters52
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique97 ?
Unique (%)< 0.1%

Sample

1st rowCarlia
2nd rowPlethodon
3rd rowEnhydris
4th rowGehyra
5th rowAnolis
ValueCountFrequency (%)
plethodon 168423
28.9%
desmognathus 35846
 
6.2%
anolis 18373
 
3.2%
lithobates 12991
 
2.2%
eleutherodactylus 9948
 
1.7%
anaxyrus 9476
 
1.6%
sceloporus 8824
 
1.5%
emoia 8233
 
1.4%
eurycea 7667
 
1.3%
pseudacris 6800
 
1.2%
Other values (1406) 295935
50.8%

Most occurring characters

ValueCountFrequency (%)
o 686047
12.3%
e 456061
 
8.2%
t 411539
 
7.4%
s 401007
 
7.2%
l 372079
 
6.7%
h 362493
 
6.5%
a 356491
 
6.4%
n 349007
 
6.3%
d 268142
 
4.8%
i 230506
 
4.1%
Other values (42) 1673319
30.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4984175
89.5%
Uppercase Letter 582516
 
10.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 686047
13.8%
e 456061
9.2%
t 411539
 
8.3%
s 401007
 
8.0%
l 372079
 
7.5%
h 362493
 
7.3%
a 356491
 
7.2%
n 349007
 
7.0%
d 268142
 
5.4%
i 230506
 
4.6%
Other values (16) 1090803
21.9%
Uppercase Letter
ValueCountFrequency (%)
P 209915
36.0%
A 57202
 
9.8%
D 53448
 
9.2%
L 37037
 
6.4%
S 34448
 
5.9%
E 33842
 
5.8%
C 32105
 
5.5%
H 19213
 
3.3%
T 18000
 
3.1%
B 13873
 
2.4%
Other values (16) 73433
 
12.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 5566691
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 686047
12.3%
e 456061
 
8.2%
t 411539
 
7.4%
s 401007
 
7.2%
l 372079
 
6.7%
h 362493
 
6.5%
a 356491
 
6.4%
n 349007
 
6.3%
d 268142
 
4.8%
i 230506
 
4.1%
Other values (42) 1673319
30.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5566691
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 686047
12.3%
e 456061
 
8.2%
t 411539
 
7.4%
s 401007
 
7.2%
l 372079
 
6.7%
h 362493
 
6.5%
a 356491
 
6.4%
n 349007
 
6.3%
d 268142
 
4.8%
i 230506
 
4.1%
Other values (42) 1673319
30.1%
Distinct1357
Distinct (%)0.2%
Missing1685
Missing (%)0.3%
Memory size4.5 MiB

Length

Max length18
Median length16
Mean length9.513697478
Min length3

Characters and Unicode

Total characters5541881
Distinct characters51
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique124 ?
Unique (%)< 0.1%

Sample

1st rowCarlia
2nd rowPlethodon
3rd rowEnhydris
4th rowGehyra
5th rowAnolis
ValueCountFrequency (%)
plethodon 168423
28.9%
desmognathus 35846
 
6.2%
anolis 18333
 
3.1%
lithobates 12991
 
2.2%
eleutherodactylus 9947
 
1.7%
anaxyrus 9476
 
1.6%
sceloporus 8824
 
1.5%
emoia 8211
 
1.4%
eurycea 7626
 
1.3%
pseudacris 6800
 
1.2%
Other values (1347) 296039
50.8%

Most occurring characters

ValueCountFrequency (%)
o 674207
12.2%
e 453829
 
8.2%
t 411395
 
7.4%
s 399277
 
7.2%
l 371549
 
6.7%
a 364650
 
6.6%
h 357431
 
6.4%
n 345478
 
6.2%
d 268371
 
4.8%
i 236254
 
4.3%
Other values (41) 1659440
29.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4959365
89.5%
Uppercase Letter 582516
 
10.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 674207
13.6%
e 453829
9.2%
t 411395
 
8.3%
s 399277
 
8.1%
l 371549
 
7.5%
a 364650
 
7.4%
h 357431
 
7.2%
n 345478
 
7.0%
d 268371
 
5.4%
i 236254
 
4.8%
Other values (16) 1076924
21.7%
Uppercase Letter
ValueCountFrequency (%)
P 210243
36.1%
A 59414
 
10.2%
D 48886
 
8.4%
L 39047
 
6.7%
E 33655
 
5.8%
S 33086
 
5.7%
C 32219
 
5.5%
H 26221
 
4.5%
T 17245
 
3.0%
R 13689
 
2.3%
Other values (15) 68811
 
11.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 5541881
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 674207
12.2%
e 453829
 
8.2%
t 411395
 
7.4%
s 399277
 
7.2%
l 371549
 
6.7%
a 364650
 
6.6%
h 357431
 
6.4%
n 345478
 
6.2%
d 268371
 
4.8%
i 236254
 
4.3%
Other values (41) 1659440
29.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5541881
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 674207
12.2%
e 453829
 
8.2%
t 411395
 
7.4%
s 399277
 
7.2%
l 371549
 
6.7%
a 364650
 
6.6%
h 357431
 
6.4%
n 345478
 
6.2%
d 268371
 
4.8%
i 236254
 
4.3%
Other values (41) 1659440
29.9%

specificEpithet
Text

Missing 

Distinct5069
Distinct (%)0.9%
Missing15011
Missing (%)2.6%
Memory size4.5 MiB

Length

Max length18
Median length16
Mean length8.818503487
Min length3

Characters and Unicode

Total characters5019404
Distinct characters27
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique752 ?
Unique (%)0.1%

Sample

1st rowbicarinata
2nd rowmontanus
3rd rowenhydris
4th rowmutilata
5th rowrichardii
ValueCountFrequency (%)
cinereus 75774
 
13.3%
glutinosus 13098
 
2.3%
fuscus 10996
 
1.9%
montanus 10396
 
1.8%
jordani 8582
 
1.5%
metcalfi 6940
 
1.2%
cylindraceus 6103
 
1.1%
carolinensis 5850
 
1.0%
teyahalee 5559
 
1.0%
septentrionalis 4872
 
0.9%
Other values (5059) 421020
74.0%

Most occurring characters

ValueCountFrequency (%)
i 543995
10.8%
s 515115
10.3%
e 488200
9.7%
a 483269
9.6%
r 401688
8.0%
u 396785
7.9%
n 359600
 
7.2%
c 306019
 
6.1%
t 278310
 
5.5%
o 259596
 
5.2%
Other values (17) 986827
19.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5018843
> 99.9%
Dash Punctuation 561
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 543995
10.8%
s 515115
10.3%
e 488200
9.7%
a 483269
9.6%
r 401688
8.0%
u 396785
7.9%
n 359600
 
7.2%
c 306019
 
6.1%
t 278310
 
5.5%
o 259596
 
5.2%
Other values (16) 986266
19.7%
Dash Punctuation
ValueCountFrequency (%)
- 561
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5018843
> 99.9%
Common 561
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 543995
10.8%
s 515115
10.3%
e 488200
9.7%
a 483269
9.6%
r 401688
8.0%
u 396785
7.9%
n 359600
 
7.2%
c 306019
 
6.1%
t 278310
 
5.5%
o 259596
 
5.2%
Other values (16) 986266
19.7%
Common
ValueCountFrequency (%)
- 561
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5019404
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 543995
10.8%
s 515115
10.3%
e 488200
9.7%
a 483269
9.6%
r 401688
8.0%
u 396785
7.9%
n 359600
 
7.2%
c 306019
 
6.1%
t 278310
 
5.5%
o 259596
 
5.2%
Other values (17) 986827
19.7%

infraspecificEpithet
Text

Missing 

Distinct1214
Distinct (%)4.9%
Missing559230
Missing (%)95.7%
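
The two percentages above seem to use different denominators: Missing (%) is relative to all observations, while Distinct (%) matches the count of distinct values divided by the non-missing values only. The check below reproduces both from the figures in this report; the variable names are illustrative, and the Distinct (%) formula is inferred from the numbers rather than taken from the profiler's documentation.

```python
# Figures copied from this report for `infraspecificEpithet`.
n_observations = 584201   # dataset-wide number of observations
n_missing = 559230
n_distinct = 1214

n_present = n_observations - n_missing

# Missing share over all rows; distinct share over present rows only.
missing_pct = round(100 * n_missing / n_observations, 1)
distinct_pct = round(100 * n_distinct / n_present, 1)

print(n_present)      # 24971
print(missing_pct)    # 95.7
print(distinct_pct)   # 4.9
```

Dividing 1214 by all 584201 observations would instead give about 0.2%, which does not match the reported 4.9%.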
Memory size4.5 MiB

Length

Max length17
Median length14
Mean length9.070041248
Min length3

Characters and Unicode

Total characters226488
Distinct characters26
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique244 ?
Unique (%)1.0%

Sample

1st rowoccidentalis
2nd rowconsobrinus
3rd rowtrinidadensis
4th rowignigularis
5th rowmetcalfi
ValueCountFrequency (%)
viridescens 1460
 
5.8%
blanchardi 1205
 
4.8%
metcalfi 1072
 
4.3%
fasciata 1043
 
4.2%
elegans 909
 
3.6%
stejnegeri 388
 
1.6%
teyahalee 370
 
1.5%
louisianensis 365
 
1.5%
dorsalis 340
 
1.4%
fuscus 318
 
1.3%
Other values (1204) 17501
70.1%

Most occurring characters

ValueCountFrequency (%)
a 26514
11.7%
i 26373
11.6%
s 22393
9.9%
e 19932
 
8.8%
n 15025
 
6.6%
r 14301
 
6.3%
l 14245
 
6.3%
t 12449
 
5.5%
c 12257
 
5.4%
u 11424
 
5.0%
Other values (16) 51575
22.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 226488
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 26514
11.7%
i 26373
11.6%
s 22393
9.9%
e 19932
 
8.8%
n 15025
 
6.6%
r 14301
 
6.3%
l 14245
 
6.3%
t 12449
 
5.5%
c 12257
 
5.4%
u 11424
 
5.0%
Other values (16) 51575
22.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 226488
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 26514
11.7%
i 26373
11.6%
s 22393
9.9%
e 19932
 
8.8%
n 15025
 
6.6%
r 14301
 
6.3%
l 14245
 
6.3%
t 12449
 
5.5%
c 12257
 
5.4%
u 11424
 
5.0%
Other values (16) 51575
22.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 226488
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 26514
11.7%
i 26373
11.6%
s 22393
9.9%
e 19932
 
8.8%
n 15025
 
6.6%
r 14301
 
6.3%
l 14245
 
6.3%
t 12449
 
5.5%
c 12257
 
5.4%
u 11424
 
5.0%
Other values (16) 51575
22.8%
Distinct9
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.5 MiB

Length

Max length10
Median length7
Mean length7.079066965
Min length5

Characters and Unicode

Total characters4135598
Distinct characters21
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowSPECIES
2nd rowSPECIES
3rd rowSPECIES
4th rowSPECIES
5th rowSPECIES
ValueCountFrequency (%)
species 544219
93.2%
subspecies 24970
 
4.3%
genus 13326
 
2.3%
family 1101
 
0.2%
order 379
 
0.1%
phylum 198
 
< 0.1%
class 5
 
< 0.1%
kingdom 2
 
< 0.1%
variety 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
S 1176684
28.5%
E 1152084
27.9%
I 570293
13.8%
P 569387
13.8%
C 569194
13.8%
U 38494
 
0.9%
B 24970
 
0.6%
G 13328
 
0.3%
N 13328
 
0.3%
L 1304
 
< 0.1%
Other values (11) 6532
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 4135598
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 1176684
28.5%
E 1152084
27.9%
I 570293
13.8%
P 569387
13.8%
C 569194
13.8%
U 38494
 
0.9%
B 24970
 
0.6%
G 13328
 
0.3%
N 13328
 
0.3%
L 1304
 
< 0.1%
Other values (11) 6532
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 4135598
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 1176684
28.5%
E 1152084
27.9%
I 570293
13.8%
P 569387
13.8%
C 569194
13.8%
U 38494
 
0.9%
B 24970
 
0.6%
G 13328
 
0.3%
N 13328
 
0.3%
L 1304
 
< 0.1%
Other values (11) 6532
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4135598
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 1176684
28.5%
E 1152084
27.9%
I 570293
13.8%
P 569387
13.8%
C 569194
13.8%
U 38494
 
0.9%
B 24970
 
0.6%
G 13328
 
0.3%
N 13328
 
0.3%
L 1304
 
< 0.1%
Other values (11) 6532
 
0.2%
Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.5 MiB

Length

Max length8
Median length8
Mean length7.914724555
Min length7

Characters and Unicode

Total characters4623790
Distinct characters15
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowACCEPTED
2nd rowACCEPTED
3rd rowACCEPTED
4th rowACCEPTED
5th rowACCEPTED
ValueCountFrequency (%)
accepted 534360
91.5%
synonym 49818
 
8.5%
doubtful 23
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
C 1068720
23.1%
E 1068720
23.1%
T 534383
11.6%
D 534383
11.6%
A 534360
11.6%
P 534360
11.6%
Y 99636
 
2.2%
N 99636
 
2.2%
O 49841
 
1.1%
S 49818
 
1.1%
Other values (5) 49933
 
1.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 4623790
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 1068720
23.1%
E 1068720
23.1%
T 534383
11.6%
D 534383
11.6%
A 534360
11.6%
P 534360
11.6%
Y 99636
 
2.2%
N 99636
 
2.2%
O 49841
 
1.1%
S 49818
 
1.1%
Other values (5) 49933
 
1.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 4623790
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 1068720
23.1%
E 1068720
23.1%
T 534383
11.6%
D 534383
11.6%
A 534360
11.6%
P 534360
11.6%
Y 99636
 
2.2%
N 99636
 
2.2%
O 49841
 
1.1%
S 49818
 
1.1%
Other values (5) 49933
 
1.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4623790
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 1068720
23.1%
E 1068720
23.1%
T 534383
11.6%
D 534383
11.6%
A 534360
11.6%
P 534360
11.6%
Y 99636
 
2.2%
N 99636
 
2.2%
O 49841
 
1.1%
S 49818
 
1.1%
Other values (5) 49933
 
1.1%

datasetKey
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.5 MiB

Length

Max length36
Median length36
Mean length36
Min length36

Characters and Unicode

Total characters21031236
Distinct characters16
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row821cc27a-e3bb-4bc5-ac34-89ada245069d
2nd row821cc27a-e3bb-4bc5-ac34-89ada245069d
3rd row821cc27a-e3bb-4bc5-ac34-89ada245069d
4th row821cc27a-e3bb-4bc5-ac34-89ada245069d
5th row821cc27a-e3bb-4bc5-ac34-89ada245069d
ValueCountFrequency (%)
821cc27a-e3bb-4bc5-ac34-89ada245069d 584201
100.0%

Most occurring characters

ValueCountFrequency (%)
c 2336804
11.1%
a 2336804
11.1%
- 2336804
11.1%
2 1752603
8.3%
b 1752603
8.3%
4 1752603
8.3%
8 1168402
 
5.6%
3 1168402
 
5.6%
5 1168402
 
5.6%
9 1168402
 
5.6%
Other values (6) 4089407
19.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 10515618
50.0%
Lowercase Letter 8178814
38.9%
Dash Punctuation 2336804
 
11.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 1752603
16.7%
4 1752603
16.7%
8 1168402
11.1%
3 1168402
11.1%
5 1168402
11.1%
9 1168402
11.1%
1 584201
 
5.6%
7 584201
 
5.6%
0 584201
 
5.6%
6 584201
 
5.6%
Lowercase Letter
ValueCountFrequency (%)
c 2336804
28.6%
a 2336804
28.6%
b 1752603
21.4%
d 1168402
14.3%
e 584201
 
7.1%
Dash Punctuation
ValueCountFrequency (%)
- 2336804
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 12852422
61.1%
Latin 8178814
38.9%

Most frequent character per script

Common
ValueCountFrequency (%)
- 2336804
18.2%
2 1752603
13.6%
4 1752603
13.6%
8 1168402
9.1%
3 1168402
9.1%
5 1168402
9.1%
9 1168402
9.1%
1 584201
 
4.5%
7 584201
 
4.5%
0 584201
 
4.5%
Latin
ValueCountFrequency (%)
c 2336804
28.6%
a 2336804
28.6%
b 1752603
21.4%
d 1168402
14.3%
e 584201
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21031236
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 2336804
11.1%
a 2336804
11.1%
- 2336804
11.1%
2 1752603
8.3%
b 1752603
8.3%
4 1752603
8.3%
8 1168402
 
5.6%
3 1168402
 
5.6%
5 1168402
 
5.6%
9 1168402
 
5.6%
Other values (6) 4089407
19.4%

publishingCountry
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.5 MiB

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters1168402
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowUS
2nd rowUS
3rd rowUS
4th rowUS
5th rowUS
ValueCountFrequency (%)
us 584201
100.0%

Most occurring characters

ValueCountFrequency (%)
U 584201
50.0%
S 584201
50.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1168402
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 584201
50.0%
S 584201
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1168402
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 584201
50.0%
S 584201
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1168402
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 584201
50.0%
S 584201
50.0%
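
The category breakdowns in this report can be approximated with Python's standard `unicodedata` module. The sketch below is only an illustration, not the report generator's own code: the helper name `category_counts` is hypothetical, and it returns the two-letter general category codes (`Lu` for "Uppercase Letter", `Nd` for "Decimal Number", `Pd` for "Dash Punctuation", `Po` for "Other Punctuation") rather than the long names shown in the tables.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            # unicodedata.category returns a two-letter code such as 'Lu' or 'Nd'.
            counts[unicodedata.category(ch)] += 1
    return counts


# Sample rows copied from the report; the variable names are illustrative.
country_sample = ["US", "US", "US", "US", "US"]
timestamp_sample = ["2024-12-02T13:56:06.739Z"]

print(category_counts(country_sample))
print(category_counts(timestamp_sample))
```

Applied to the full column instead of a five-row sample, the same loop should reproduce the "Most occurring categories" counts, assuming missing cells are skipped beforehand.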
Distinct: 186736
Distinct (%): 32.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 24
Median length: 24
Mean length: 23.99567957
Min length: 20

Characters and Unicode

Total characters: 14018300
Distinct characters: 15
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 40383
Unique (%): 6.9%

Sample

1st row: 2024-12-02T13:56:06.739Z
2nd row: 2024-12-02T13:56:08.224Z
3rd row: 2024-12-02T13:55:56.801Z
4th row: 2024-12-02T13:59:51.499Z
5th row: 2024-12-02T13:58:04.592Z

Value                     Count   Frequency (%)
2024-12-02t13:57:45.601z  17      < 0.1%
2024-12-02t13:57:52.847z  16      < 0.1%
2024-12-02t13:57:54.221z  16      < 0.1%
2024-12-02t13:57:23.249z  16      < 0.1%
2024-12-02t13:57:51.135z  16      < 0.1%
2024-12-02t13:57:50.745z  15      < 0.1%
2024-12-02t13:58:01.663z  15      < 0.1%
2024-12-02t13:56:52.538z  15      < 0.1%
2024-12-02t13:57:30.398z  15      < 0.1%
2024-12-02t13:57:53.169z  15      < 0.1%
Other values (186726)     584045  > 99.9%

Most occurring characters

ValueCountFrequency (%)
2 2668002
19.0%
0 1480784
10.6%
1 1472907
10.5%
- 1168402
8.3%
: 1168402
8.3%
4 939301
 
6.7%
5 927875
 
6.6%
3 926225
 
6.6%
T 584201
 
4.2%
Z 584201
 
4.2%
Other values (5) 2098000
15.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 9929524
70.8%
Other Punctuation 1751972
 
12.5%
Dash Punctuation 1168402
 
8.3%
Uppercase Letter 1168402
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2668002
26.9%
0 1480784
14.9%
1 1472907
14.8%
4 939301
 
9.5%
5 927875
 
9.3%
3 926225
 
9.3%
7 448284
 
4.5%
9 373157
 
3.8%
6 352898
 
3.6%
8 340091
 
3.4%
Other Punctuation
ValueCountFrequency (%)
: 1168402
66.7%
. 583570
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 584201
50.0%
Z 584201
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 1168402
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 12849898
91.7%
Latin 1168402
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2668002
20.8%
0 1480784
11.5%
1 1472907
11.5%
- 1168402
9.1%
: 1168402
9.1%
4 939301
 
7.3%
5 927875
 
7.2%
3 926225
 
7.2%
. 583570
 
4.5%
7 448284
 
3.5%
Other values (3) 1066146
 
8.3%
Latin
ValueCountFrequency (%)
T 584201
50.0%
Z 584201
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14018300
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2668002
19.0%
0 1480784
10.6%
1 1472907
10.5%
- 1168402
8.3%
: 1168402
8.3%
4 939301
 
6.7%
5 927875
 
6.6%
3 926225
 
6.6%
T 584201
 
4.2%
Z 584201
 
4.2%
Other values (5) 2098000
15.0%

elevation
Text

Missing 

Distinct: 1604
Distinct (%): 0.6%
Missing: 332110
Missing (%): 56.8%
Memory size: 4.5 MiB

Length

Max length: 6
Median length: 5
Mean length: 5.180430876
Min length: 3

Characters and Unicode

Total characters: 1305940
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 190
Unique (%): 0.1%

Sample

1st row: 1317.0
2nd row: 1326.0
3rd row: 2200.0
4th row: 40.0
5th row: 9.0

Value                Count   Frequency (%)
1067.0               4286    1.7%
373.0                4059    1.6%
1036.0               2829    1.1%
200.0                2818    1.1%
3.0                  2315    0.9%
280.0                2242    0.9%
6.0                  2149    0.9%
174.0                2077    0.8%
1146.0               2023    0.8%
152.0                2023    0.8%
Other values (1591)  225270  89.4%

Most occurring characters

ValueCountFrequency (%)
0 353001
27.0%
. 252091
19.3%
1 176370
13.5%
2 80512
 
6.2%
3 77490
 
5.9%
5 76790
 
5.9%
4 63616
 
4.9%
7 61821
 
4.7%
6 60632
 
4.6%
9 53647
 
4.1%
Other values (2) 49970
 
3.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1053844
80.7%
Other Punctuation 252091
 
19.3%
Dash Punctuation 5
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 353001
33.5%
1 176370
16.7%
2 80512
 
7.6%
3 77490
 
7.4%
5 76790
 
7.3%
4 63616
 
6.0%
7 61821
 
5.9%
6 60632
 
5.8%
9 53647
 
5.1%
8 49965
 
4.7%
Other Punctuation
ValueCountFrequency (%)
. 252091
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1305940
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 353001
27.0%
. 252091
19.3%
1 176370
13.5%
2 80512
 
6.2%
3 77490
 
5.9%
5 76790
 
5.9%
4 63616
 
4.9%
7 61821
 
4.7%
6 60632
 
4.6%
9 53647
 
4.1%
Other values (2) 49970
 
3.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1305940
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 353001
27.0%
. 252091
19.3%
1 176370
13.5%
2 80512
 
6.2%
3 77490
 
5.9%
5 76790
 
5.9%
4 63616
 
4.9%
7 61821
 
4.7%
6 60632
 
4.6%
9 53647
 
4.1%
Other values (2) 49970
 
3.8%
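
Because the report types `elevation` as Text, any numeric summary needs a conversion step first. The following is a minimal sketch of that step, assuming missing cells arrive as empty strings or `None` (the raw export is not shown in this report, so that is an assumption); the helper name `parse_elevation` is hypothetical.

```python
from statistics import median


def parse_elevation(raw):
    """Convert a text cell such as '1317.0' to float; return None if empty or unparseable."""
    if raw is None:
        return None
    text = raw.strip()
    if not text:
        return None
    try:
        return float(text)
    except ValueError:
        # Leave unparseable cells as missing instead of guessing a value.
        return None


# The five sample rows from the report, plus two assumed missing cells.
rows = ["1317.0", "1326.0", "2200.0", "40.0", "9.0", "", None]
values = [v for v in map(parse_elevation, rows) if v is not None]

print(len(values), min(values), median(values), max(values))
```

Negative values parse the same way, which matters here because the dash count above shows that five elevation cells contain a minus sign.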

elevationAccuracy
Text

Missing 

Distinct: 136
Distinct (%): 0.1%
Missing: 333288
Missing (%): 57.1%
Memory size: 4.5 MiB

Length

Max length: 19
Median length: 3
Mean length: 3.118419532
Min length: 3

Characters and Unicode

Total characters: 782452
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): < 0.1%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 10.0
5th row: 0.0

Value               Count   Frequency (%)
0.0                 220652  87.9%
38.0                4866    1.9%
30.5                2706    1.1%
15.0                2411    1.0%
18.0                1878    0.7%
20.0                1562    0.6%
15.5                1561    0.6%
61.0                1329    0.5%
26.0                989     0.4%
12.0                842     0.3%
Other values (126)  12117   4.8%

Most occurring characters

ValueCountFrequency (%)
0 469334
60.0%
. 250913
32.1%
5 16435
 
2.1%
1 11282
 
1.4%
3 10235
 
1.3%
8 7880
 
1.0%
2 7575
 
1.0%
6 3460
 
0.4%
4 2449
 
0.3%
7 1564
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 531539
67.9%
Other Punctuation 250913
32.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 469334
88.3%
5 16435
 
3.1%
1 11282
 
2.1%
3 10235
 
1.9%
8 7880
 
1.5%
2 7575
 
1.4%
6 3460
 
0.7%
4 2449
 
0.5%
7 1564
 
0.3%
9 1325
 
0.2%
Other Punctuation
ValueCountFrequency (%)
. 250913
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 782452
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 469334
60.0%
. 250913
32.1%
5 16435
 
2.1%
1 11282
 
1.4%
3 10235
 
1.3%
8 7880
 
1.0%
2 7575
 
1.0%
6 3460
 
0.4%
4 2449
 
0.3%
7 1564
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 782452
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 469334
60.0%
. 250913
32.1%
5 16435
 
2.1%
1 11282
 
1.4%
3 10235
 
1.3%
8 7880
 
1.0%
2 7575
 
1.0%
6 3460
 
0.4%
4 2449
 
0.3%
7 1564
 
0.2%
Distinct: 146
Distinct (%): 5.9%
Missing: 581727
Missing (%): 99.6%
Memory size: 4.5 MiB

Length

Max length: 18
Median length: 17
Mean length: 17.05052546
Min length: 3

Characters and Unicode

Total characters: 42183
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 48
Unique (%): 1.9%

Sample

1st row: 818.1211019658687
2nd row: 4856.291022878801
3rd row: 1710.7413076448918
4th row: 3977.2558796326234
5th row: 4961.494346970892

Value               Count  Frequency (%)
2063.191632254214   334    13.5%
4961.494346970892   245    9.9%
1710.7413076448918  132    5.3%
4852.601362825603   128    5.2%
818.1211019658687   83     3.4%
4878.72894658956    83     3.4%
2259.882955420656   80     3.2%
1971.0139476565842  69     2.8%
3977.2558796326234  55     2.2%
4128.7637113665405  53     2.1%
Other values (136)  1212   49.0%

Most occurring characters

ValueCountFrequency (%)
4 5219
12.4%
2 4705
11.2%
1 4367
10.4%
6 4167
9.9%
3 3976
9.4%
8 3849
9.1%
9 3792
9.0%
5 3395
8.0%
7 3367
8.0%
0 2872
6.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 39709
94.1%
Other Punctuation 2474
 
5.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 5219
13.1%
2 4705
11.8%
1 4367
11.0%
6 4167
10.5%
3 3976
10.0%
8 3849
9.7%
9 3792
9.5%
5 3395
8.5%
7 3367
8.5%
0 2872
7.2%
Other Punctuation
ValueCountFrequency (%)
. 2474
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 42183
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 5219
12.4%
2 4705
11.2%
1 4367
10.4%
6 4167
9.9%
3 3976
9.4%
8 3849
9.1%
9 3792
9.0%
5 3395
8.0%
7 3367
8.0%
0 2872
6.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 42183
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 5219
12.4%
2 4705
11.2%
1 4367
10.4%
6 4167
9.9%
3 3976
9.4%
8 3849
9.1%
9 3792
9.0%
5 3395
8.0%
7 3367
8.0%
0 2872
6.8%

issue
Text

Distinct: 165
Distinct (%): < 0.1%
Missing: 2
Missing (%): < 0.1%
Memory size: 4.5 MiB

Length

Max length: 186
Median length: 179
Mean length: 68.60599385
Min length: 28

Characters and Unicode

Total characters: 40079553
Distinct characters: 29
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 31
Unique (%): < 0.1%

Sample

1st row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84
2nd row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84
3rd row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT
4th row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;CONTINENT_DERIVED_FROM_COUNTRY;CONTINENT_INVALID
5th row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84

Value  Count  Frequency (%)
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84  249666  42.7%
occurrence_status_inferred_from_individual_count  227050  38.9%
occurrence_status_inferred_from_individual_count;coordinate_reprojected  34842  6.0%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;geodetic_datum_invalid  12098  2.1%
occurrence_status_inferred_from_individual_count;continent_derived_from_country;continent_invalid  9004  1.5%
occurrence_status_inferred_from_individual_count;country_derived_from_coordinates;country_invalid;geodetic_datum_assumed_wgs84;continent_derived_from_coordinates;continent_invalid  6887  1.2%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;continent_derived_from_coordinates;continent_invalid  6809  1.2%
occurrence_status_inferred_from_individual_count;taxon_match_higherrank  5451  0.9%
occurrence_status_inferred_from_individual_count;country_invalid  5419  0.9%
occurrence_status_inferred_from_individual_count;geodetic_datum_assumed_wgs84;taxon_match_higherrank  4662  0.8%
Other values (155)  22311  3.8%

Most occurring characters

ValueCountFrequency (%)
_ 4053596
10.1%
E 3553564
 
8.9%
R 3202871
 
8.0%
U 2974631
 
7.4%
I 2922553
 
7.3%
D 2881308
 
7.2%
C 2865419
 
7.1%
N 2681132
 
6.7%
T 2652470
 
6.6%
O 2377822
 
5.9%
Other values (19) 9914187
24.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 35006044
87.3%
Connector Punctuation 4053596
 
10.1%
Decimal Number 577858
 
1.4%
Other Punctuation 442055
 
1.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 3553564
10.2%
R 3202871
9.1%
U 2974631
8.5%
I 2922553
8.3%
D 2881308
8.2%
C 2865419
8.2%
N 2681132
 
7.7%
T 2652470
 
7.6%
O 2377822
 
6.8%
S 2077060
 
5.9%
Other values (15) 6817214
19.5%
Decimal Number
ValueCountFrequency (%)
8 288929
50.0%
4 288929
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4053596
100.0%
Other Punctuation
ValueCountFrequency (%)
; 442055
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 35006044
87.3%
Common 5073509
 
12.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 3553564
10.2%
R 3202871
9.1%
U 2974631
8.5%
I 2922553
8.3%
D 2881308
8.2%
C 2865419
8.2%
N 2681132
 
7.7%
T 2652470
 
7.6%
O 2377822
 
6.8%
S 2077060
 
5.9%
Other values (15) 6817214
19.5%
Common
ValueCountFrequency (%)
_ 4053596
79.9%
; 442055
 
8.7%
8 288929
 
5.7%
4 288929
 
5.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 40079553
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
_ 4053596
10.1%
E 3553564
 
8.9%
R 3202871
 
8.0%
U 2974631
 
7.4%
I 2922553
 
7.3%
D 2881308
 
7.2%
C 2865419
 
7.1%
N 2681132
 
6.7%
T 2652470
 
6.6%
O 2377822
 
5.9%
Other values (19) 9914187
24.7%
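
Each `issue` cell packs several flags into one semicolon-delimited string, so the value counts above describe flag combinations rather than individual flags. A minimal sketch of counting the flags separately follows; the helper name `flag_counts` is hypothetical, and the input is just the five sample rows shown for this variable, not the full column.

```python
from collections import Counter


def flag_counts(issue_values):
    """Split each semicolon-delimited cell and count every flag on its own."""
    counts = Counter()
    for value in issue_values:
        if not value:
            # Skip missing cells (the report lists 2 for this variable).
            continue
        counts.update(flag for flag in value.split(";") if flag)
    return counts


sample = [
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;CONTINENT_DERIVED_FROM_COUNTRY;CONTINENT_INVALID",
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;GEODETIC_DATUM_ASSUMED_WGS84",
]

for flag, n in flag_counts(sample).most_common():
    print(flag, n)
```

Counting per flag makes it easier to see how widespread a single condition is, for example an assumed datum, regardless of which other flags accompany it.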

mediaType
Text

Missing 

Distinct: 23
Distinct (%): 0.4%
Missing: 579082
Missing (%): 99.1%
Memory size: 4.5 MiB

Length

Max length: 285
Median length: 274
Mean length: 32.18480172
Min length: 10

Characters and Unicode

Total characters: 164754
Distinct characters: 10
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 0.1%

Sample

1st row: StillImage;StillImage;StillImage;StillImage;StillImage
2nd row: StillImage;StillImage
3rd row: StillImage;StillImage;StillImage
4th row: StillImage;StillImage;StillImage
5th row: StillImage;StillImage;StillImage;StillImage

Value  Count  Frequency (%)
stillimage;stillimage  2352  45.9%
stillimage  841  16.4%
stillimage;stillimage;stillimage  690  13.5%
stillimage;stillimage;stillimage;stillimage  366  7.1%
stillimage;stillimage;stillimage;stillimage;stillimage  268  5.2%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage  188  3.7%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage  118  2.3%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage  110  2.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage  58  1.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage  41  0.8%
Other values (13)  87  1.7%

Most occurring characters

ValueCountFrequency (%)
l 30886
18.7%
S 15443
9.4%
t 15443
9.4%
i 15443
9.4%
I 15443
9.4%
m 15443
9.4%
a 15443
9.4%
g 15443
9.4%
e 15443
9.4%
; 10324
 
6.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 123544
75.0%
Uppercase Letter 30886
 
18.7%
Other Punctuation 10324
 
6.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 30886
25.0%
t 15443
12.5%
i 15443
12.5%
m 15443
12.5%
a 15443
12.5%
g 15443
12.5%
e 15443
12.5%
Uppercase Letter
ValueCountFrequency (%)
S 15443
50.0%
I 15443
50.0%
Other Punctuation
ValueCountFrequency (%)
; 10324
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 154430
93.7%
Common 10324
 
6.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 30886
20.0%
S 15443
10.0%
t 15443
10.0%
i 15443
10.0%
I 15443
10.0%
m 15443
10.0%
a 15443
10.0%
g 15443
10.0%
e 15443
10.0%
Common
ValueCountFrequency (%)
; 10324
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 164754
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 30886
18.7%
S 15443
9.4%
t 15443
9.4%
i 15443
9.4%
I 15443
9.4%
m 15443
9.4%
a 15443
9.4%
g 15443
9.4%
e 15443
9.4%
; 10324
 
6.3%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 5
Median length: 4
Mean length: 4.278443549
Min length: 4

Characters and Unicode

Total characters: 2499471
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: true
2nd row: true
3rd row: false
4th row: false
5th row: true

Value  Count   Frequency (%)
true   421534  72.2%
false  162667  27.8%

Most occurring characters

ValueCountFrequency (%)
e 584201
23.4%
t 421534
16.9%
r 421534
16.9%
u 421534
16.9%
f 162667
 
6.5%
a 162667
 
6.5%
l 162667
 
6.5%
s 162667
 
6.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2499471
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 584201
23.4%
t 421534
16.9%
r 421534
16.9%
u 421534
16.9%
f 162667
 
6.5%
a 162667
 
6.5%
l 162667
 
6.5%
s 162667
 
6.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 2499471
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 584201
23.4%
t 421534
16.9%
r 421534
16.9%
u 421534
16.9%
f 162667
 
6.5%
a 162667
 
6.5%
l 162667
 
6.5%
s 162667
 
6.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2499471
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 584201
23.4%
t 421534
16.9%
r 421534
16.9%
u 421534
16.9%
f 162667
 
6.5%
a 162667
 
6.5%
l 162667
 
6.5%
s 162667
 
6.5%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 5
Median length: 5
Mean length: 4.996222191
Min length: 4

Characters and Unicode

Total characters: 2918798
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: false
2nd row: false
3rd row: false
4th row: false
5th row: false

Value  Count   Frequency (%)
false  581994  99.6%
true   2207    0.4%

Most occurring characters

ValueCountFrequency (%)
e 584201
20.0%
f 581994
19.9%
a 581994
19.9%
l 581994
19.9%
s 581994
19.9%
t 2207
 
0.1%
r 2207
 
0.1%
u 2207
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2918798
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 584201
20.0%
f 581994
19.9%
a 581994
19.9%
l 581994
19.9%
s 581994
19.9%
t 2207
 
0.1%
r 2207
 
0.1%
u 2207
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 2918798
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 584201
20.0%
f 581994
19.9%
a 581994
19.9%
l 581994
19.9%
s 581994
19.9%
t 2207
 
0.1%
r 2207
 
0.1%
u 2207
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2918798
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 584201
20.0%
f 581994
19.9%
a 581994
19.9%
l 581994
19.9%
s 581994
19.9%
t 2207
 
0.1%
r 2207
 
0.1%
u 2207
 
0.1%
Distinct: 9012
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 8
Median length: 7
Mean length: 6.999032867
Min length: 1

Characters and Unicode

Total characters: 4088842
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1713
Unique (%): 0.3%

Sample

1st row: 5225055
2nd row: 2431506
3rd row: 5224383
4th row: 2446249
5th row: 2467415

Value                Count   Frequency (%)
2431491              75714   13.0%
2431539              13092   2.2%
2431224              10137   1.7%
2431506              9986    1.7%
2431529              7074    1.2%
2431516              6940    1.2%
2431489              6103    1.0%
2431484              5559    1.0%
2431219              4681    0.8%
2431510              4614    0.8%
Other values (9002)  440301  75.4%

Most occurring characters

ValueCountFrequency (%)
2 867830
21.2%
4 749774
18.3%
1 546778
13.4%
3 448678
11.0%
5 328014
 
8.0%
9 301296
 
7.4%
6 239069
 
5.8%
7 213079
 
5.2%
8 210506
 
5.1%
0 183818
 
4.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4088842
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 867830
21.2%
4 749774
18.3%
1 546778
13.4%
3 448678
11.0%
5 328014
 
8.0%
9 301296
 
7.4%
6 239069
 
5.8%
7 213079
 
5.2%
8 210506
 
5.1%
0 183818
 
4.5%

Most occurring scripts

ValueCountFrequency (%)
Common 4088842
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 867830
21.2%
4 749774
18.3%
1 546778
13.4%
3 448678
11.0%
5 328014
 
8.0%
9 301296
 
7.4%
6 239069
 
5.8%
7 213079
 
5.2%
8 210506
 
5.1%
0 183818
 
4.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4088842
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 867830
21.2%
4 749774
18.3%
1 546778
13.4%
3 448678
11.0%
5 328014
 
8.0%
9 301296
 
7.4%
6 239069
 
5.8%
7 213079
 
5.2%
8 210506
 
5.1%
0 183818
 
4.5%
Distinct: 8475
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 8
Median length: 7
Mean length: 7.019563472
Min length: 1

Characters and Unicode

Total characters: 4100836
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1520
Unique (%): 0.3%

Sample

1st row: 5225055
2nd row: 2431506
3rd row: 5224383
4th row: 2446249
5th row: 2467415

Value                Count   Frequency (%)
2431491              75714   13.0%
2431539              13092   2.2%
2431224              10146   1.7%
2431506              9986    1.7%
2431516              8012    1.4%
2431529              7074    1.2%
2431489              6103    1.0%
2431484              5929    1.0%
2431219              4681    0.8%
2431510              4614    0.8%
Other values (8465)  438850  75.1%

Most occurring characters

ValueCountFrequency (%)
2 850954
20.8%
4 741903
18.1%
1 558280
13.6%
3 457912
11.2%
5 330769
 
8.1%
9 302439
 
7.4%
6 235990
 
5.8%
8 216124
 
5.3%
7 207914
 
5.1%
0 198551
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4100836
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 850954
20.8%
4 741903
18.1%
1 558280
13.6%
3 457912
11.2%
5 330769
 
8.1%
9 302439
 
7.4%
6 235990
 
5.8%
8 216124
 
5.3%
7 207914
 
5.1%
0 198551
 
4.8%

Most occurring scripts

ValueCountFrequency (%)
Common 4100836
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 850954
20.8%
4 741903
18.1%
1 558280
13.6%
3 457912
11.2%
5 330769
 
8.1%
9 302439
 
7.4%
6 235990
 
5.8%
8 216124
 
5.3%
7 207914
 
5.1%
0 198551
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4100836
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 850954
20.8%
4 741903
18.1%
1 558280
13.6%
3 457912
11.2%
5 330769
 
8.1%
9 302439
 
7.4%
6 235990
 
5.8%
8 216124
 
5.3%
7 207914
 
5.1%
0 198551
 
4.8%

kingdomKey
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 584201
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Value  Count   Frequency (%)
1      584201  100.0%

Most occurring characters

ValueCountFrequency (%)
1 584201
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 584201
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 584201
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 584201
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 584201
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 584201
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 584201
100.0%

phylumKey
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 5
Missing (%): < 0.1%
Memory size: 4.5 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1168392
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 44
2nd row: 44
3rd row: 44
4th row: 44
5th row: 44

Value  Count   Frequency (%)
44     584196  100.0%

Most occurring characters

ValueCountFrequency (%)
4 1168392
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1168392
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 1168392
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1168392
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 1168392
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1168392
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 1168392
100.0%
Distinct: 5
Distinct (%): < 0.1%
Missing: 203
Missing (%): < 0.1%
Memory size: 4.5 MiB

Length

Max length: 8
Median length: 3
Mean length: 4.616760674
Min length: 3

Characters and Unicode

Total characters: 2696179
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 11592253
2nd row: 131
3rd row: 11592253
4th row: 11592253
5th row: 11592253

Value     Count   Frequency (%)
131       395161  67.7%
11592253  169110  29.0%
11418114  18909   3.2%
11493978  804     0.1%
11569602  14      < 0.1%

Most occurring characters

ValueCountFrequency (%)
1 1224723
45.4%
3 565075
21.0%
5 338234
 
12.5%
2 338234
 
12.5%
9 170732
 
6.3%
4 38622
 
1.4%
8 19713
 
0.7%
7 804
 
< 0.1%
6 28
 
< 0.1%
0 14
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2696179
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1224723
45.4%
3 565075
21.0%
5 338234
 
12.5%
2 338234
 
12.5%
9 170732
 
6.3%
4 38622
 
1.4%
8 19713
 
0.7%
7 804
 
< 0.1%
6 28
 
< 0.1%
0 14
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 2696179
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1224723
45.4%
3 565075
21.0%
5 338234
 
12.5%
2 338234
 
12.5%
9 170732
 
6.3%
4 38622
 
1.4%
8 19713
 
0.7%
7 804
 
< 0.1%
6 28
 
< 0.1%
0 14
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2696179
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1224723
45.4%
3 565075
21.0%
5 338234
 
12.5%
2 338234
 
12.5%
9 170732
 
6.3%
4 38622
 
1.4%
8 19713
 
0.7%
7 804
 
< 0.1%
6 28
 
< 0.1%
0 14
 
< 0.1%

orderKey
Text

Missing 

Distinct: 3
Distinct (%): < 0.1%
Missing: 189040
Missing (%): 32.4%
Memory size: 4.5 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 1185483
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 953
2nd row: 953
3rd row: 952
4th row: 952
5th row: 953

Value  Count   Frequency (%)
953    237129  60.0%
952    157511  39.9%
775    521     0.1%

Most occurring characters

ValueCountFrequency (%)
5 395161
33.3%
9 394640
33.3%
3 237129
20.0%
2 157511
 
13.3%
7 1042
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1185483
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 395161
33.3%
9 394640
33.3%
3 237129
20.0%
2 157511
 
13.3%
7 1042
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 1185483
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 395161
33.3%
9 394640
33.3%
3 237129
20.0%
2 157511
 
13.3%
7 1042
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1185483
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 395161
33.3%
9 394640
33.3%
3 237129
20.0%
2 157511
 
13.3%
7 1042
 
0.1%
Distinct: 159
Distinct (%): < 0.1%
Missing: 587
Missing (%): 0.1%
Memory size: 4.5 MiB

Length

Max length: 8
Median length: 4
Mean length: 4.194695467
Min length: 4

Characters and Unicode

Total characters: 2448083
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 9115
2nd row: 6748
3rd row: 5789856
4th row: 5666
5th row: 8345926

Value               Count   Frequency (%)
6748                221371  37.9%
6735                41566   7.1%
6172                38793   6.6%
9115                26153   4.5%
6727                25125   4.3%
6746                20333   3.5%
8345926             18373   3.1%
5666                17255   3.0%
5016                16259   2.8%
6739                10435   1.8%
Other values (149)  147951  25.4%

Most occurring characters

ValueCountFrequency (%)
6 544722
22.3%
7 457502
18.7%
4 315203
12.9%
8 289994
11.8%
5 197308
 
8.1%
1 162565
 
6.6%
3 151911
 
6.2%
2 123166
 
5.0%
9 117217
 
4.8%
0 88495
 
3.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2448083
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 544722
22.3%
7 457502
18.7%
4 315203
12.9%
8 289994
11.8%
5 197308
 
8.1%
1 162565
 
6.6%
3 151911
 
6.2%
2 123166
 
5.0%
9 117217
 
4.8%
0 88495
 
3.6%

Most occurring scripts

ValueCountFrequency (%)
Common 2448083
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
6 544722
22.3%
7 457502
18.7%
4 315203
12.9%
8 289994
11.8%
5 197308
 
8.1%
1 162565
 
6.6%
3 151911
 
6.2%
2 123166
 
5.0%
9 117217
 
4.8%
0 88495
 
3.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2448083
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 544722
22.3%
7 457502
18.7%
4 315203
12.9%
8 289994
11.8%
5 197308
 
8.1%
1 162565
 
6.6%
3 151911
 
6.2%
2 123166
 
5.0%
9 117217
 
4.8%
0 88495
 
3.6%
Distinct1418
Distinct (%)0.2%
Missing1685
Missing (%)0.3%
Memory size4.5 MiB

Length

Max length8
Median length7
Mean length7.007369755
Min length7

Characters and Unicode

Total characters4081905
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique98 ?
Unique (%)< 0.1%

Sample

1st row2461082
2nd row2431477
3rd row2449677
4th row2446193
5th row8782549
Value                Count    Frequency (%)
2431477              168423   28.9%
2431198              35846    6.2%
8782549              18373    3.2%
2427046              12991    2.2%
2424035              9948     1.7%
2422857              9476     1.6%
2451143              8824     1.5%
2463307              8233     1.4%
5218343              7667     1.3%
2428124              6800     1.2%
Other values (1408)  295935   50.8%

Most occurring characters

ValueCountFrequency (%)
4 924908
22.7%
2 838523
20.5%
7 519504
12.7%
3 429420
10.5%
1 409884
10.0%
8 234519
 
5.7%
5 214552
 
5.3%
9 184654
 
4.5%
6 179151
 
4.4%
0 146790
 
3.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4081905
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 924908
22.7%
2 838523
20.5%
7 519504
12.7%
3 429420
10.5%
1 409884
10.0%
8 234519
 
5.7%
5 214552
 
5.3%
9 184654
 
4.5%
6 179151
 
4.4%
0 146790
 
3.6%

Most occurring scripts

ValueCountFrequency (%)
Common 4081905
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 924908
22.7%
2 838523
20.5%
7 519504
12.7%
3 429420
10.5%
1 409884
10.0%
8 234519
 
5.7%
5 214552
 
5.3%
9 184654
 
4.5%
6 179151
 
4.4%
0 146790
 
3.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4081905
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 924908
22.7%
2 838523
20.5%
7 519504
12.7%
3 429420
10.5%
1 409884
10.0%
8 234519
 
5.7%
5 214552
 
5.3%
9 184654
 
4.5%
6 179151
 
4.4%
0 146790
 
3.6%

speciesKey
Text

Missing 

Distinct7286
Distinct (%)1.3%
Missing15011
Missing (%)2.6%
Memory size4.5 MiB

Length

Max length8
Median length7
Mean length7.029765105
Min length7

Characters and Unicode

Total characters4001272
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1233 ?
Unique (%)0.2%
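
The summary figures for this variable can be reproduced from a column of raw values. The sketch below assumes that "Distinct" counts the different non-missing values, "Unique" counts values that occur exactly once, and "Missing" counts absent cells. The function name `summarize` and the toy column are hypothetical. The percentages reported here are consistent with Missing (%) being taken over all 584201 observations and Distinct (%) over the 569190 non-missing ones, which is an inference from the numbers rather than a documented rule.

```python
from collections import Counter

def summarize(values):
    """Return (distinct, unique, missing); None marks a missing cell."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    distinct = len(counts)                              # different values seen
    unique = sum(1 for n in counts.values() if n == 1)  # values seen exactly once
    missing = len(values) - len(present)
    return distinct, unique, missing

# Toy column, not taken from the dataset.
column = ["2431491", "2431491", "2431539", None, "5225055"]
print(summarize(column))

# Percentages recomputed from the figures reported for this variable.
observations = 584201
missing = 15011
non_missing = observations - missing
print(round(100 * missing / observations, 1))  # compare with Missing (%)
print(round(100 * 7286 / non_missing, 1))      # compare with Distinct (%)
```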

Sample

1st row5225055
2nd row2431506
3rd row5224383
4th row2446249
5th row2467415
Value                Count    Frequency (%)
2431491              75714    13.3%
2431539              13092    2.3%
2431224              10146    1.8%
2431506              9986     1.8%
2431516              8012     1.4%
2431529              7074     1.2%
2431489              6103     1.1%
2431484              5929     1.0%
2431219              4681     0.8%
2431510              4614     0.8%
Other values (7276)  423839   74.5%

Most occurring characters

ValueCountFrequency (%)
2 839953
21.0%
4 732856
18.3%
1 542210
13.6%
3 452620
11.3%
5 325394
 
8.1%
9 290547
 
7.3%
6 220240
 
5.5%
8 210077
 
5.3%
0 193852
 
4.8%
7 193523
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4001272
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 839953
21.0%
4 732856
18.3%
1 542210
13.6%
3 452620
11.3%
5 325394
 
8.1%
9 290547
 
7.3%
6 220240
 
5.5%
8 210077
 
5.3%
0 193852
 
4.8%
7 193523
 
4.8%

Most occurring scripts

ValueCountFrequency (%)
Common 4001272
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 839953
21.0%
4 732856
18.3%
1 542210
13.6%
3 452620
11.3%
5 325394
 
8.1%
9 290547
 
7.3%
6 220240
 
5.5%
8 210077
 
5.3%
0 193852
 
4.8%
7 193523
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4001272
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 839953
21.0%
4 732856
18.3%
1 542210
13.6%
3 452620
11.3%
5 325394
 
8.1%
9 290547
 
7.3%
6 220240
 
5.5%
8 210077
 
5.3%
0 193852
 
4.8%
7 193523
 
4.8%

species
Text

Missing 

Distinct7285
Distinct (%)1.3%
Missing15011
Missing (%)2.6%
Memory size4.5 MiB

Length

Max length34
Median length31
Mean length19.39421459
Min length9

Characters and Unicode

Total characters11038993
Distinct characters54
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1233 ?
Unique (%)0.2%

Sample

1st rowCarlia bicarinata
2nd rowPlethodon montanus
3rd rowEnhydris enhydris
4th rowGehyra mutilata
5th rowAnolis richardii
Value                Count    Frequency (%)
plethodon            165820   14.6%
cinereus             77201    6.8%
desmognathus         34836    3.1%
anolis               18232    1.6%
glutinosus           13098    1.2%
lithobates           12881    1.1%
fuscus               10914    1.0%
montanus             10391    0.9%
eleutherodactylus    9909     0.9%
anaxyrus             9456     0.8%
Other values (6398)  775642   68.1%
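
The entries in this table are lowercased single words ("plethodon", "cinereus") rather than whole values such as "Plethodon montanus", which suggests that each value is case-folded and split on whitespace before counting. A minimal sketch of that assumed tokenization follows; `word_counts` is a hypothetical helper, not the profiler's API.

```python
from collections import Counter

def word_counts(values):
    """Count lowercased, whitespace-separated words across all values."""
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    return counts

# The five sample rows shown for this variable.
rows = [
    "Carlia bicarinata",
    "Plethodon montanus",
    "Enhydris enhydris",
    "Gehyra mutilata",
    "Anolis richardii",
]
for word, n in word_counts(rows).most_common(3):
    print(word, n)
```

Under this reading the counts sum to the number of words, not the number of rows, which is why a genus name and a species epithet from the same record both appear in the table.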

Most occurring characters

ValueCountFrequency (%)
e 936730
 
8.5%
o 931837
 
8.4%
s 908046
 
8.2%
a 828057
 
7.5%
i 769871
 
7.0%
n 699609
 
6.3%
t 680178
 
6.2%
l 614875
 
5.6%
r 614619
 
5.6%
u 607807
 
5.5%
Other values (44) 3447364
31.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 9900052
89.7%
Space Separator 569190
 
5.2%
Uppercase Letter 569190
 
5.2%
Dash Punctuation 561
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 936730
9.5%
o 931837
9.4%
s 908046
 
9.2%
a 828057
 
8.4%
i 769871
 
7.8%
n 699609
 
7.1%
t 680178
 
6.9%
l 614875
 
6.2%
r 614619
 
6.2%
u 607807
 
6.1%
Other values (16) 2308423
23.3%
Uppercase Letter
ValueCountFrequency (%)
P 206462
36.3%
A 55946
 
9.8%
D 52070
 
9.1%
L 36458
 
6.4%
S 33536
 
5.9%
E 33482
 
5.9%
C 30807
 
5.4%
H 18383
 
3.2%
T 17582
 
3.1%
B 13484
 
2.4%
Other values (16) 70980
 
12.5%
Space Separator
ValueCountFrequency (%)
569190
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 561
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10469242
94.8%
Common 569751
 
5.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 936730
 
8.9%
o 931837
 
8.9%
s 908046
 
8.7%
a 828057
 
7.9%
i 769871
 
7.4%
n 699609
 
6.7%
t 680178
 
6.5%
l 614875
 
5.9%
r 614619
 
5.9%
u 607807
 
5.8%
Other values (42) 2877613
27.5%
Common
ValueCountFrequency (%)
569190
99.9%
- 561
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11038993
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 936730
 
8.5%
o 931837
 
8.4%
s 908046
 
8.2%
a 828057
 
7.5%
i 769871
 
7.0%
n 699609
 
6.3%
t 680178
 
6.2%
l 614875
 
5.6%
r 614619
 
5.6%
u 607807
 
5.5%
Other values (44) 3447364
31.2%
Distinct8475
Distinct (%)1.5%
Missing0
Missing (%)0.0%
Memory size4.5 MiB

Length

Max length182
Median length112
Mean length35.64721046
Min length5

Characters and Unicode

Total characters20825136
Distinct characters88
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1520 ?
Unique (%)0.3%

Sample

1st rowCarlia bicarinata (Macleay, 1877)
2nd rowPlethodon montanus Highton & Peabody, 2000
3rd rowEnhydris enhydris (Schneider, 1799)
4th rowGehyra mutilata (Wiegmann, 1834)
5th rowAnolis richardii Duméril & Bibron, 1837
Value                Count     Frequency (%)
plethodon            168423    6.7%
green                95577     3.8%
1818                 93564     3.7%
&                    82003     3.3%
cinereus             77201     3.1%
desmognathus         35846     1.4%
cope                 33460     1.3%
duméril              27066     1.1%
linnaeus             26934     1.1%
bibron               23993     1.0%
Other values (8473)  1852441   73.6%

Most occurring characters

ValueCountFrequency (%)
1932307
 
9.3%
e 1568353
 
7.5%
o 1194544
 
5.7%
n 1166456
 
5.6%
a 1137420
 
5.5%
i 1101341
 
5.3%
s 1056757
 
5.1%
r 1050847
 
5.0%
t 855218
 
4.1%
l 825579
 
4.0%
Other values (78) 8936314
42.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13876417
66.6%
Decimal Number 2316084
 
11.1%
Space Separator 1932307
 
9.3%
Uppercase Letter 1283545
 
6.2%
Other Punctuation 673329
 
3.2%
Open Punctuation 368234
 
1.8%
Close Punctuation 368234
 
1.8%
Dash Punctuation 6986
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1568353
11.3%
o 1194544
 
8.6%
n 1166456
 
8.4%
a 1137420
 
8.2%
i 1101341
 
7.9%
s 1056757
 
7.6%
r 1050847
 
7.6%
t 855218
 
6.2%
l 825579
 
5.9%
u 775518
 
5.6%
Other values (32) 3144384
22.7%
Uppercase Letter
ValueCountFrequency (%)
P 235206
18.3%
G 164279
12.8%
D 115545
9.0%
B 113754
8.9%
L 96736
7.5%
S 85411
 
6.7%
C 82764
 
6.4%
H 78166
 
6.1%
A 63159
 
4.9%
E 37002
 
2.9%
Other values (18) 211523
16.5%
Decimal Number
ValueCountFrequency (%)
1 726593
31.4%
8 564818
24.4%
9 223745
 
9.7%
2 150515
 
6.5%
0 136547
 
5.9%
5 119543
 
5.2%
7 114778
 
5.0%
6 102349
 
4.4%
3 97140
 
4.2%
4 80056
 
3.5%
Other Punctuation
ValueCountFrequency (%)
, 590016
87.6%
& 82003
 
12.2%
' 797
 
0.1%
. 513
 
0.1%
Space Separator
ValueCountFrequency (%)
1932307
100.0%
Open Punctuation
ValueCountFrequency (%)
( 368234
100.0%
Close Punctuation
ValueCountFrequency (%)
) 368234
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6986
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15159962
72.8%
Common 5665174
 
27.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1568353
 
10.3%
o 1194544
 
7.9%
n 1166456
 
7.7%
a 1137420
 
7.5%
i 1101341
 
7.3%
s 1056757
 
7.0%
r 1050847
 
6.9%
t 855218
 
5.6%
l 825579
 
5.4%
u 775518
 
5.1%
Other values (60) 4427929
29.2%
Common
ValueCountFrequency (%)
1932307
34.1%
1 726593
 
12.8%
, 590016
 
10.4%
8 564818
 
10.0%
( 368234
 
6.5%
) 368234
 
6.5%
9 223745
 
3.9%
2 150515
 
2.7%
0 136547
 
2.4%
5 119543
 
2.1%
Other values (8) 484622
 
8.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20779927
99.8%
None 45209
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1932307
 
9.3%
e 1568353
 
7.5%
o 1194544
 
5.7%
n 1166456
 
5.6%
a 1137420
 
5.5%
i 1101341
 
5.3%
s 1056757
 
5.1%
r 1050847
 
5.1%
t 855218
 
4.1%
l 825579
 
4.0%
Other values (60) 8891105
42.8%
None
ValueCountFrequency (%)
é 30067
66.5%
ü 10808
 
23.9%
è 1828
 
4.0%
ö 1268
 
2.8%
í 442
 
1.0%
Ö 294
 
0.7%
ñ 248
 
0.5%
á 137
 
0.3%
ó 64
 
0.1%
å 20
 
< 0.1%
Other values (8) 33
 
0.1%
Distinct9530
Distinct (%)1.6%
Missing0
Missing (%)0.0%
Memory size4.5 MiB

Length

Max length62
Median length56
Mean length19.84556343
Min length4

Characters and Unicode

Total characters11593798
Distinct characters59
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1890 ?
Unique (%)0.3%

Sample

1st rowCarlia bicarinata
2nd rowPlethodon montanus
3rd rowEnhydris enhydris
4th rowGehyra mutilata
5th rowAnolis richardii
Value                Count    Frequency (%)
plethodon            168423   14.0%
cinereus             75774    6.3%
desmognathus         35846    3.0%
anolis               18352    1.5%
glutinosus           13372    1.1%
lithobates           12991    1.1%
fuscus               11321    0.9%
montanus             10417    0.9%
eleutherodactylus    9959     0.8%
anaxyrus             9474     0.8%
Other values (7195)  837184   69.6%

Most occurring characters

ValueCountFrequency (%)
e 976778
 
8.4%
o 954528
 
8.2%
s 947788
 
8.2%
a 896948
 
7.7%
i 821046
 
7.1%
n 729826
 
6.3%
t 711935
 
6.1%
l 642687
 
5.5%
u 635392
 
5.5%
r 633897
 
5.5%
Other values (49) 3642973
31.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10381385
89.5%
Space Separator 618912
 
5.3%
Uppercase Letter 582830
 
5.0%
Other Punctuation 10078
 
0.1%
Dash Punctuation 561
 
< 0.1%
Open Punctuation 16
 
< 0.1%
Close Punctuation 16
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 976778
9.4%
o 954528
 
9.2%
s 947788
 
9.1%
a 896948
 
8.6%
i 821046
 
7.9%
n 729826
 
7.0%
t 711935
 
6.9%
l 642687
 
6.2%
u 635392
 
6.1%
r 633897
 
6.1%
Other values (16) 2430560
23.4%
Uppercase Letter
ValueCountFrequency (%)
P 210261
36.1%
A 59585
 
10.2%
D 48691
 
8.4%
L 39048
 
6.7%
E 33682
 
5.8%
S 33117
 
5.7%
C 32240
 
5.5%
H 26213
 
4.5%
T 17139
 
2.9%
R 13689
 
2.3%
Other values (15) 69165
 
11.9%
Other Punctuation
ValueCountFrequency (%)
" 8496
84.3%
. 1545
 
15.3%
/ 21
 
0.2%
? 16
 
0.2%
Space Separator
ValueCountFrequency (%)
618912
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 561
100.0%
Open Punctuation
ValueCountFrequency (%)
( 16
100.0%
Close Punctuation
ValueCountFrequency (%)
) 16
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10964215
94.6%
Common 629583
 
5.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 976778
 
8.9%
o 954528
 
8.7%
s 947788
 
8.6%
a 896948
 
8.2%
i 821046
 
7.5%
n 729826
 
6.7%
t 711935
 
6.5%
l 642687
 
5.9%
u 635392
 
5.8%
r 633897
 
5.8%
Other values (41) 3013390
27.5%
Common
ValueCountFrequency (%)
618912
98.3%
" 8496
 
1.3%
. 1545
 
0.2%
- 561
 
0.1%
/ 21
 
< 0.1%
( 16
 
< 0.1%
? 16
 
< 0.1%
) 16
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11593798
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 976778
 
8.4%
o 954528
 
8.2%
s 947788
 
8.2%
a 896948
 
7.7%
i 821046
 
7.1%
n 729826
 
6.3%
t 711935
 
6.1%
l 642687
 
5.5%
u 635392
 
5.5%
r 633897
 
5.5%
Other values (49) 3642973
31.4%

protocol
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.5 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1752603
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowEML
2nd rowEML
3rd rowEML
4th rowEML
5th rowEML
Value  Count    Frequency (%)
eml    584201   100.0%

Most occurring characters

ValueCountFrequency (%)
E 584201
33.3%
M 584201
33.3%
L 584201
33.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1752603
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 584201
33.3%
M 584201
33.3%
L 584201
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 1752603
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 584201
33.3%
M 584201
33.3%
L 584201
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1752603
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 584201
33.3%
M 584201
33.3%
L 584201
33.3%
Distinct186736
Distinct (%)32.0%
Missing0
Missing (%)0.0%
Memory size4.5 MiB

Length

Max length24
Median length24
Mean length23.99567957
Min length20
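
The values of this variable look like ISO 8601 UTC timestamps stored as text. A minimum length of 20 against a median of 24 suggests that some values omit the millisecond part, so a parser should accept both forms. The sketch below uses only the standard library; `parse_utc` is a hypothetical helper, and the 20-character example is invented to illustrate the shorter form.

```python
from datetime import datetime, timezone

def parse_utc(text):
    """Parse 'YYYY-MM-DDTHH:MM:SS[.fff]Z' into a timezone-aware UTC datetime."""
    # Pick the format by whether a fractional-seconds part is present.
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ" if "." in text else "%Y-%m-%dT%H:%M:%SZ"
    return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)

print(parse_utc("2024-12-02T13:56:06.739Z"))  # 24-character form, as in the sample rows
print(parse_utc("2024-12-02T13:56:06Z"))      # invented 20-character form
```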

Characters and Unicode

Total characters14018300
Distinct characters15
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique40383 ?
Unique (%)6.9%

Sample

1st row2024-12-02T13:56:06.739Z
2nd row2024-12-02T13:56:08.224Z
3rd row2024-12-02T13:55:56.801Z
4th row2024-12-02T13:59:51.499Z
5th row2024-12-02T13:58:04.592Z
Value                     Count    Frequency (%)
2024-12-02t13:57:45.601z  17       < 0.1%
2024-12-02t13:57:52.847z  16       < 0.1%
2024-12-02t13:57:54.221z  16       < 0.1%
2024-12-02t13:57:23.249z  16       < 0.1%
2024-12-02t13:57:51.135z  16       < 0.1%
2024-12-02t13:57:50.745z  15       < 0.1%
2024-12-02t13:58:01.663z  15       < 0.1%
2024-12-02t13:56:52.538z  15       < 0.1%
2024-12-02t13:57:30.398z  15       < 0.1%
2024-12-02t13:57:53.169z  15       < 0.1%
Other values (186726)     584045   > 99.9%

Most occurring characters

ValueCountFrequency (%)
2 2668002
19.0%
0 1480784
10.6%
1 1472907
10.5%
- 1168402
8.3%
: 1168402
8.3%
4 939301
 
6.7%
5 927875
 
6.6%
3 926225
 
6.6%
T 584201
 
4.2%
Z 584201
 
4.2%
Other values (5) 2098000
15.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 9929524
70.8%
Other Punctuation 1751972
 
12.5%
Dash Punctuation 1168402
 
8.3%
Uppercase Letter 1168402
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2668002
26.9%
0 1480784
14.9%
1 1472907
14.8%
4 939301
 
9.5%
5 927875
 
9.3%
3 926225
 
9.3%
7 448284
 
4.5%
9 373157
 
3.8%
6 352898
 
3.6%
8 340091
 
3.4%
Other Punctuation
ValueCountFrequency (%)
: 1168402
66.7%
. 583570
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 584201
50.0%
Z 584201
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 1168402
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 12849898
91.7%
Latin 1168402
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2668002
20.8%
0 1480784
11.5%
1 1472907
11.5%
- 1168402
9.1%
: 1168402
9.1%
4 939301
 
7.3%
5 927875
 
7.2%
3 926225
 
7.2%
. 583570
 
4.5%
7 448284
 
3.5%
Other values (3) 1066146
 
8.3%
Latin
ValueCountFrequency (%)
T 584201
50.0%
Z 584201
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14018300
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2668002
19.0%
0 1480784
10.6%
1 1472907
10.5%
- 1168402
8.3%
: 1168402
8.3%
4 939301
 
6.7%
5 927875
 
6.6%
3 926225
 
6.6%
T 584201
 
4.2%
Z 584201
 
4.2%
Other values (5) 2098000
15.0%

lastCrawled
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.5 MiB

Length

Max length24
Median length24
Mean length24
Min length24

Characters and Unicode

Total characters14020824
Distinct characters12
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2024-12-02T11:48:23.416Z
2nd row2024-12-02T11:48:23.416Z
3rd row2024-12-02T11:48:23.416Z
4th row2024-12-02T11:48:23.416Z
5th row2024-12-02T11:48:23.416Z
Value                     Count    Frequency (%)
2024-12-02t11:48:23.416z  584201   100.0%

Most occurring characters

ValueCountFrequency (%)
2 2921005
20.8%
1 2336804
16.7%
4 1752603
12.5%
0 1168402
 
8.3%
- 1168402
 
8.3%
: 1168402
 
8.3%
T 584201
 
4.2%
8 584201
 
4.2%
3 584201
 
4.2%
. 584201
 
4.2%
Other values (2) 1168402
 
8.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 9931417
70.8%
Other Punctuation 1752603
 
12.5%
Dash Punctuation 1168402
 
8.3%
Uppercase Letter 1168402
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2921005
29.4%
1 2336804
23.5%
4 1752603
17.6%
0 1168402
 
11.8%
8 584201
 
5.9%
3 584201
 
5.9%
6 584201
 
5.9%
Other Punctuation
ValueCountFrequency (%)
: 1168402
66.7%
. 584201
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 584201
50.0%
Z 584201
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 1168402
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 12852422
91.7%
Latin 1168402
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2921005
22.7%
1 2336804
18.2%
4 1752603
13.6%
0 1168402
 
9.1%
- 1168402
 
9.1%
: 1168402
 
9.1%
8 584201
 
4.5%
3 584201
 
4.5%
. 584201
 
4.5%
6 584201
 
4.5%
Latin
ValueCountFrequency (%)
T 584201
50.0%
Z 584201
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14020824
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2921005
20.8%
1 2336804
16.7%
4 1752603
12.5%
0 1168402
 
8.3%
- 1168402
 
8.3%
: 1168402
 
8.3%
T 584201
 
4.2%
8 584201
 
4.2%
3 584201
 
4.2%
. 584201
 
4.2%
Other values (2) 1168402
 
8.3%

repatriated
Text

Missing 

Distinct2
Distinct (%)< 0.1%
Missing10596
Missing (%)1.8%
Memory size4.5 MiB

Length

Max length5
Median length5
Mean length4.582658798
Min length4

Characters and Unicode

Total characters2628636
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowtrue
2nd rowfalse
3rd rowtrue
4th rowtrue
5th rowfalse
Value  Count    Frequency (%)
false  334216   58.3%
true   239389   41.7%
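
For a variable with only two values, the length statistics can be checked directly against the value counts. The sketch below recomputes the non-missing row count, the total character count, and the mean length from the two counts in this table; the variable names are illustrative.

```python
# Value counts reported for this variable.
counts = {"false": 334216, "true": 239389}

non_missing = sum(counts.values())
total_chars = sum(len(value) * n for value, n in counts.items())
mean_length = total_chars / non_missing

print(non_missing)            # rows with a value
print(total_chars)            # compare with "Total characters"
print(round(mean_length, 9))  # compare with "Mean length"
```

The non-missing count also agrees with the 584201 observations minus the 10596 missing values reported for this variable.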

Most occurring characters

ValueCountFrequency (%)
e 573605
21.8%
f 334216
12.7%
a 334216
12.7%
l 334216
12.7%
s 334216
12.7%
t 239389
9.1%
r 239389
9.1%
u 239389
9.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2628636
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 573605
21.8%
f 334216
12.7%
a 334216
12.7%
l 334216
12.7%
s 334216
12.7%
t 239389
9.1%
r 239389
9.1%
u 239389
9.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 2628636
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 573605
21.8%
f 334216
12.7%
a 334216
12.7%
l 334216
12.7%
s 334216
12.7%
t 239389
9.1%
r 239389
9.1%
u 239389
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2628636
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 573605
21.8%
f 334216
12.7%
a 334216
12.7%
l 334216
12.7%
s 334216
12.7%
t 239389
9.1%
r 239389
9.1%
u 239389
9.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.5 MiB

Length

Max length5
Median length5
Mean length4.998765836
Min length4

Characters and Unicode

Total characters2920284
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowfalse
2nd rowfalse
3rd rowfalse
4th rowfalse
5th rowfalse
Value  Count    Frequency (%)
false  583480   99.9%
true   721      0.1%

Most occurring characters

ValueCountFrequency (%)
e 584201
20.0%
f 583480
20.0%
a 583480
20.0%
l 583480
20.0%
s 583480
20.0%
t 721
 
< 0.1%
r 721
 
< 0.1%
u 721
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2920284
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 584201
20.0%
f 583480
20.0%
a 583480
20.0%
l 583480
20.0%
s 583480
20.0%
t 721
 
< 0.1%
r 721
 
< 0.1%
u 721
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 2920284
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 584201
20.0%
f 583480
20.0%
a 583480
20.0%
l 583480
20.0%
s 583480
20.0%
t 721
 
< 0.1%
r 721
 
< 0.1%
u 721
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2920284
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 584201
20.0%
f 583480
20.0%
a 583480
20.0%
l 583480
20.0%
s 583480
20.0%
t 721
 
< 0.1%
r 721
 
< 0.1%
u 721
 
< 0.1%

gbifRegion
Text

Missing 

Distinct6
Distinct (%)< 0.1%
Missing11409
Missing (%)2.0%
Memory size4.5 MiB

Length

Max length13
Median length13
Mean length11.80906158
Min length4

Characters and Unicode

Total characters6764136
Distinct characters16
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowOCEANIA
2nd rowNORTH_AMERICA
3rd rowOCEANIA
4th rowLATIN_AMERICA
5th rowNORTH_AMERICA
Value          Count    Frequency (%)
north_america  335375   58.6%
latin_america  147208   25.7%
asia           39442    6.9%
oceania        28187    4.9%
africa         19937    3.5%
europe         2643     0.5%

Most occurring characters

ValueCountFrequency (%)
A 1287506
19.0%
R 840538
12.4%
I 717357
10.6%
C 530707
7.8%
E 516056
7.6%
N 510770
 
7.6%
T 482583
 
7.1%
_ 482583
 
7.1%
M 482583
 
7.1%
O 366205
 
5.4%
Other values (6) 547248
8.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 6281553
92.9%
Connector Punctuation 482583
 
7.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 1287506
20.5%
R 840538
13.4%
I 717357
11.4%
C 530707
8.4%
E 516056
8.2%
N 510770
 
8.1%
T 482583
 
7.7%
M 482583
 
7.7%
O 366205
 
5.8%
H 335375
 
5.3%
Other values (5) 211873
 
3.4%
Connector Punctuation
ValueCountFrequency (%)
_ 482583
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6281553
92.9%
Common 482583
 
7.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 1287506
20.5%
R 840538
13.4%
I 717357
11.4%
C 530707
8.4%
E 516056
8.2%
N 510770
 
8.1%
T 482583
 
7.7%
M 482583
 
7.7%
O 366205
 
5.8%
H 335375
 
5.3%
Other values (5) 211873
 
3.4%
Common
ValueCountFrequency (%)
_ 482583
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6764136
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 1287506
19.0%
R 840538
12.4%
I 717357
10.6%
C 530707
7.8%
E 516056
7.6%
N 510770
 
7.6%
T 482583
 
7.1%
_ 482583
 
7.1%
M 482583
 
7.1%
O 366205
 
5.4%
Other values (6) 547248
8.1%

publishedByGbifRegion
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.5 MiB

Length

Max length13
Median length13
Mean length13
Min length13

Characters and Unicode

Total characters7594613
Distinct characters11
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowNORTH_AMERICA
2nd rowNORTH_AMERICA
3rd rowNORTH_AMERICA
4th rowNORTH_AMERICA
5th rowNORTH_AMERICA
Value          Count    Frequency (%)
north_america  584201   100.0%

Most occurring characters

ValueCountFrequency (%)
R 1168402
15.4%
A 1168402
15.4%
N 584201
7.7%
O 584201
7.7%
T 584201
7.7%
H 584201
7.7%
_ 584201
7.7%
M 584201
7.7%
E 584201
7.7%
I 584201
7.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 7010412
92.3%
Connector Punctuation 584201
 
7.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 1168402
16.7%
A 1168402
16.7%
N 584201
8.3%
O 584201
8.3%
T 584201
8.3%
H 584201
8.3%
M 584201
8.3%
E 584201
8.3%
I 584201
8.3%
C 584201
8.3%
Connector Punctuation
ValueCountFrequency (%)
_ 584201
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7010412
92.3%
Common 584201
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 1168402
16.7%
A 1168402
16.7%
N 584201
8.3%
O 584201
8.3%
T 584201
8.3%
H 584201
8.3%
M 584201
8.3%
E 584201
8.3%
I 584201
8.3%
C 584201
8.3%
Common
ValueCountFrequency (%)
_ 584201
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7594613
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 1168402
15.4%
A 1168402
15.4%
N 584201
7.7%
O 584201
7.7%
T 584201
7.7%
H 584201
7.7%
_ 584201
7.7%
M 584201
7.7%
E 584201
7.7%
I 584201
7.7%

level0Gid
Text

Missing 

Distinct175
Distinct (%)< 0.1%
Missing173676
Missing (%)29.7%
Memory size4.5 MiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters1231575
Distinct characters28
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique14 ?
Unique (%)< 0.1%

Sample

1st rowPNG
2nd rowUSA
3rd rowGRD
4th rowUSA
5th rowUSA
Value               Count    Frequency (%)
usa                 282827   68.9%
ecu                 14871    3.6%
bra                 13519    3.3%
per                 12508    3.0%
hnd                 10032    2.4%
mex                 4961     1.2%
dom                 4618     1.1%
cub                 3855     0.9%
png                 3606     0.9%
hti                 3483     0.8%
Other values (165)  56245    13.7%

Most occurring characters

ValueCountFrequency (%)
U 308853
25.1%
A 306743
24.9%
S 286617
23.3%
R 40442
 
3.3%
E 40339
 
3.3%
P 28015
 
2.3%
N 27326
 
2.2%
C 26870
 
2.2%
M 25313
 
2.1%
B 21622
 
1.8%
Other values (18) 119435
 
9.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1231567
> 99.9%
Decimal Number 8
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 308853
25.1%
A 306743
24.9%
S 286617
23.3%
R 40442
 
3.3%
E 40339
 
3.3%
P 28015
 
2.3%
N 27326
 
2.2%
C 26870
 
2.2%
M 25313
 
2.1%
B 21622
 
1.8%
Other values (16) 119427
 
9.7%
Decimal Number
ValueCountFrequency (%)
0 4
50.0%
6 4
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1231567
> 99.9%
Common 8
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 308853
25.1%
A 306743
24.9%
S 286617
23.3%
R 40442
 
3.3%
E 40339
 
3.3%
P 28015
 
2.3%
N 27326
 
2.2%
C 26870
 
2.2%
M 25313
 
2.1%
B 21622
 
1.8%
Other values (16) 119427
 
9.7%
Common
ValueCountFrequency (%)
0 4
50.0%
6 4
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1231575
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 308853
25.1%
A 306743
24.9%
S 286617
23.3%
R 40442
 
3.3%
E 40339
 
3.3%
P 28015
 
2.3%
N 27326
 
2.2%
C 26870
 
2.2%
M 25313
 
2.1%
B 21622
 
1.8%
Other values (18) 119435
 
9.7%

level0Name
Text

Missing 

Distinct175
Distinct (%)< 0.1%
Missing173676
Missing (%)29.7%
Memory size4.5 MiB

Length

Max length32
Median length13
Mean length11.47883564
Min length4

Characters and Unicode

Total characters4712349
Distinct characters58
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique14 ?
Unique (%)< 0.1%

Sample

1st rowPapua New Guinea
2nd rowUnited States
3rd rowGrenada
4th rowUnited States
5th rowUnited States
Value               Count    Frequency (%)
united              283460   38.9%
states              283455   38.9%
ecuador             14871    2.0%
brazil              13519    1.9%
peru                12508    1.7%
honduras            10032    1.4%
republic            5833     0.8%
méxico              4961     0.7%
dominican           4618     0.6%
guinea              3935     0.5%
Other values (203)  92017    12.6%

Most occurring characters

ValueCountFrequency (%)
t 869782
18.5%
e 617316
13.1%
a 436923
9.3%
i 361412
7.7%
n 346814
 
7.4%
d 321822
 
6.8%
318684
 
6.8%
s 308278
 
6.5%
S 286866
 
6.1%
U 283929
 
6.0%
Other values (48) 560523
11.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3669150
77.9%
Uppercase Letter 724199
 
15.4%
Space Separator 318684
 
6.8%
Other Punctuation 280
 
< 0.1%
Dash Punctuation 36
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 869782
23.7%
e 617316
16.8%
a 436923
11.9%
i 361412
9.9%
n 346814
 
9.5%
d 321822
 
8.8%
s 308278
 
8.4%
r 74760
 
2.0%
u 73108
 
2.0%
o 63163
 
1.7%
Other values (20) 195772
 
5.3%
Uppercase Letter
ValueCountFrequency (%)
S 286866
39.6%
U 283929
39.2%
P 24491
 
3.4%
E 17791
 
2.5%
B 16468
 
2.3%
H 13529
 
1.9%
M 12279
 
1.7%
C 12037
 
1.7%
R 10730
 
1.5%
G 10373
 
1.4%
Other values (13) 35706
 
4.9%
Other Punctuation
ValueCountFrequency (%)
, 116
41.4%
. 104
37.1%
' 60
21.4%
Space Separator
ValueCountFrequency (%)
(space) 318684
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 36
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4393349
93.2%
Common 319000
 
6.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 869782
19.8%
e 617316
14.1%
a 436923
9.9%
i 361412
8.2%
n 346814
 
7.9%
d 321822
 
7.3%
s 308278
 
7.0%
S 286866
 
6.5%
U 283929
 
6.5%
r 74760
 
1.7%
Other values (43) 485447
11.0%
Common
ValueCountFrequency (%)
(space) 318684
99.9%
, 116
 
< 0.1%
. 104
 
< 0.1%
' 60
 
< 0.1%
- 36
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4704801
99.8%
None 7548
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 869782
18.5%
e 617316
13.1%
a 436923
9.3%
i 361412
7.7%
n 346814
 
7.4%
d 321822
 
6.8%
(space) 318684
 
6.8%
s 308278
 
6.6%
S 286866
 
6.1%
U 283929
 
6.0%
Other values (44) 552975
11.8%
None
ValueCountFrequency (%)
é 5812
77.0%
ã 838
 
11.1%
í 838
 
11.1%
ô 60
 
0.8%

level1Gid
Text

Missing 

Distinct 1192
Distinct (%) 0.3%
Missing 174349
Missing (%) 29.8%
Memory size 4.5 MiB

Length

Max length 8
Median length 8
Mean length 7.801640592
Min length 6

Characters and Unicode

Total characters 3197518
Distinct characters 38
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 127
Unique (%) < 0.1%

Sample

1st row PNG.2_1
2nd row USA.34_1
3rd row GRD.4_1
4th row USA.47_1
5th row USA.29_1
ValueCountFrequency (%)
usa.47_1 68346
 
16.7%
usa.34_1 51010
 
12.4%
usa.21_1 30907
 
7.5%
usa.39_1 18483
 
4.5%
usa.49_1 17227
 
4.2%
usa.43_1 10691
 
2.6%
usa.5_1 8858
 
2.2%
usa.11_1 8403
 
2.1%
usa.10_1 7784
 
1.9%
usa.37_1 5139
 
1.3%
Other values (1182) 183004
44.7%

Most occurring characters

ValueCountFrequency (%)
1 548162
17.1%
_ 409852
12.8%
. 409646
12.8%
U 308853
9.7%
A 306694
9.6%
S 286615
9.0%
4 178904
 
5.6%
3 121201
 
3.8%
7 85072
 
2.7%
2 75286
 
2.4%
Other values (28) 467233
14.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1229548
38.5%
Decimal Number 1148472
35.9%
Connector Punctuation 409852
 
12.8%
Other Punctuation 409646
 
12.8%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 308853
25.1%
A 306694
24.9%
S 286615
23.3%
E 40339
 
3.3%
R 39839
 
3.2%
P 28004
 
2.3%
N 27315
 
2.2%
C 26851
 
2.2%
M 25289
 
2.1%
B 21595
 
1.8%
Other values (16) 118154
 
9.6%
Decimal Number
ValueCountFrequency (%)
1 548162
47.7%
4 178904
 
15.6%
3 121201
 
10.6%
7 85072
 
7.4%
2 75286
 
6.6%
9 49176
 
4.3%
5 33537
 
2.9%
8 25139
 
2.2%
6 17641
 
1.5%
0 14354
 
1.2%
Connector Punctuation
ValueCountFrequency (%)
_ 409852
100.0%
Other Punctuation
ValueCountFrequency (%)
. 409646
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1967970
61.5%
Latin 1229548
38.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 308853
25.1%
A 306694
24.9%
S 286615
23.3%
E 40339
 
3.3%
R 39839
 
3.2%
P 28004
 
2.3%
N 27315
 
2.2%
C 26851
 
2.2%
M 25289
 
2.1%
B 21595
 
1.8%
Other values (16) 118154
 
9.6%
Common
ValueCountFrequency (%)
1 548162
27.9%
_ 409852
20.8%
. 409646
20.8%
4 178904
 
9.1%
3 121201
 
6.2%
7 85072
 
4.3%
2 75286
 
3.8%
9 49176
 
2.5%
5 33537
 
1.7%
8 25139
 
1.3%
Other values (2) 31995
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3197518
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 548162
17.1%
_ 409852
12.8%
. 409646
12.8%
U 308853
9.7%
A 306694
9.6%
S 286615
9.0%
4 178904
 
5.6%
3 121201
 
3.8%
7 85072
 
2.7%
2 75286
 
2.4%
Other values (28) 467233
14.6%

level1Name
Text

Missing 

Distinct 1147
Distinct (%) 0.3%
Missing 174349
Missing (%) 29.8%
Memory size 4.5 MiB

Length

Max length 32
Median length 30
Mean length 9.58385466
Min length 3

Characters and Unicode

Total characters 3927962
Distinct characters 99
Distinct categories 8
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 118
Unique (%) < 0.1%

Sample

1st row Central
2nd row North Carolina
3rd row Saint George
4th row Virginia
5th row Nevada
ValueCountFrequency (%)
virginia 85573
 
15.6%
carolina 55602
 
10.1%
north 51399
 
9.3%
maryland 30907
 
5.6%
pennsylvania 18483
 
3.4%
west 17320
 
3.1%
tennessee 10691
 
1.9%
california 9457
 
1.7%
georgia 8403
 
1.5%
de 7912
 
1.4%
Other values (1285) 254513
46.3%

Most occurring characters

ValueCountFrequency (%)
a 585330
14.9%
i 477589
12.2%
n 369556
 
9.4%
r 331611
 
8.4%
o 265471
 
6.8%
l 174911
 
4.5%
e 169093
 
4.3%
s 148236
 
3.8%
(space) 140408
 
3.6%
t 129070
 
3.3%
Other values (89) 1136687
28.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3241955
82.5%
Uppercase Letter 539254
 
13.7%
Space Separator 140408
 
3.6%
Dash Punctuation 4354
 
0.1%
Other Punctuation 1959
 
< 0.1%
Modifier Symbol 30
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 585330
18.1%
i 477589
14.7%
n 369556
11.4%
r 331611
10.2%
o 265471
8.2%
l 174911
 
5.4%
e 169093
 
5.2%
s 148236
 
4.6%
t 129070
 
4.0%
g 116837
 
3.6%
Other values (49) 474251
14.6%
Uppercase Letter
ValueCountFrequency (%)
V 87515
16.2%
C 81820
15.2%
N 67344
12.5%
M 54002
10.0%
P 35817
 
6.6%
S 25857
 
4.8%
T 24573
 
4.6%
A 22550
 
4.2%
W 21290
 
3.9%
G 18587
 
3.4%
Other values (20) 99899
18.5%
Other Punctuation
ValueCountFrequency (%)
' 1416
72.3%
/ 451
 
23.0%
! 44
 
2.2%
. 38
 
1.9%
, 10
 
0.5%
Space Separator
ValueCountFrequency (%)
(space) 140408
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4354
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 30
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 1
100.0%
Close Punctuation
ValueCountFrequency (%)
] 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3781209
96.3%
Common 146753
 
3.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 585330
15.5%
i 477589
12.6%
n 369556
 
9.8%
r 331611
 
8.8%
o 265471
 
7.0%
l 174911
 
4.6%
e 169093
 
4.5%
s 148236
 
3.9%
t 129070
 
3.4%
g 116837
 
3.1%
Other values (79) 1013505
26.8%
Common
ValueCountFrequency (%)
(space) 140408
95.7%
- 4354
 
3.0%
' 1416
 
1.0%
/ 451
 
0.3%
! 44
 
< 0.1%
. 38
 
< 0.1%
` 30
 
< 0.1%
, 10
 
< 0.1%
[ 1
 
< 0.1%
] 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3904533
99.4%
None 23265
 
0.6%
Latin Ext Additional 164
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 585330
15.0%
i 477589
12.2%
n 369556
 
9.5%
r 331611
 
8.5%
o 265471
 
6.8%
l 174911
 
4.5%
e 169093
 
4.3%
s 148236
 
3.8%
(space) 140408
 
3.6%
t 129070
 
3.3%
Other values (52) 1113258
28.5%
None
ValueCountFrequency (%)
á 7994
34.4%
é 4043
17.4%
í 3671
15.8%
ã 3233
13.9%
ó 1380
 
5.9%
ô 910
 
3.9%
ú 598
 
2.6%
ñ 567
 
2.4%
ü 278
 
1.2%
ï 179
 
0.8%
Other values (20) 412
 
1.8%
Latin Ext Additional
ValueCountFrequency (%)
59
36.0%
39
23.8%
35
21.3%
17
 
10.4%
5
 
3.0%
ế 5
 
3.0%
4
 
2.4%

level2Gid
Text

Missing 

Distinct 4973
Distinct (%) 1.2%
Missing 186113
Missing (%) 31.9%
Memory size 4.5 MiB
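
The two percentages above appear to use different denominators: Missing (%) is consistent with a share of all 584201 observations, while Distinct (%) only matches the reported 1.2% when divided by the non-missing count. That reading is inferred from the figures, not stated by the report. A minimal sketch that reproduces both values, using a hypothetical helper name `profile_percentages`:

```python
def profile_percentages(n_rows, n_missing, n_distinct):
    """Return (missing %, distinct %) rounded to one decimal place.

    Assumption inferred from the figures: distinct % is taken over
    non-missing cells, missing % over all rows.
    """
    n_present = n_rows - n_missing
    missing_pct = 100.0 * n_missing / n_rows
    distinct_pct = 100.0 * n_distinct / n_present if n_present else 0.0
    return round(missing_pct, 1), round(distinct_pct, 1)


# Figures for level2Gid: 584201 observations, 186113 missing, 4973 distinct.
result = profile_percentages(584201, 186113, 4973)
print(result)  # (31.9, 1.2)
```

Dividing 4973 by all 584201 rows instead would give roughly 0.9%, which does not match the reported value.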

Length

Max length 12
Median length 11
Mean length 10.59813157
Min length 8

Characters and Unicode

Total characters 4218989
Distinct characters 38
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 873
Unique (%) 0.2%

Sample

1st row PNG.2.3_1
2nd row USA.34.11_1
3rd row USA.47.9_1
4th row USA.29.5_1
5th row BRA.19.34_2
ValueCountFrequency (%)
usa.34.87_1 9937
 
2.5%
usa.47.50_1 7933
 
2.0%
usa.21.10_1 6723
 
1.7%
usa.34.56_1 6344
 
1.6%
usa.34.44_1 5697
 
1.4%
per.1.4_1 4919
 
1.2%
usa.21.16_1 4431
 
1.1%
usa.43.78_1 3919
 
1.0%
usa.49.42_1 3487
 
0.9%
usa.47.53_1 3397
 
0.9%
Other values (4963) 341301
85.7%

Most occurring characters

ValueCountFrequency (%)
. 795970
18.9%
1 660755
15.7%
_ 398088
9.4%
A 305944
 
7.3%
U 305494
 
7.2%
S 286408
 
6.8%
4 250715
 
5.9%
3 195227
 
4.6%
2 179315
 
4.3%
7 136265
 
3.2%
Other values (28) 704808
16.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1830675
43.4%
Uppercase Letter 1194256
28.3%
Other Punctuation 795970
18.9%
Connector Punctuation 398088
 
9.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 305944
25.6%
U 305494
25.6%
S 286408
24.0%
E 40275
 
3.4%
R 38370
 
3.2%
C 26058
 
2.2%
N 25849
 
2.2%
P 23473
 
2.0%
B 20673
 
1.7%
M 19615
 
1.6%
Other values (16) 102097
 
8.5%
Decimal Number
ValueCountFrequency (%)
1 660755
36.1%
4 250715
 
13.7%
3 195227
 
10.7%
2 179315
 
9.8%
7 136265
 
7.4%
5 106767
 
5.8%
9 82662
 
4.5%
8 81137
 
4.4%
6 74816
 
4.1%
0 63016
 
3.4%
Other Punctuation
ValueCountFrequency (%)
. 795970
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 398088
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3024733
71.7%
Latin 1194256
 
28.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 305944
25.6%
U 305494
25.6%
S 286408
24.0%
E 40275
 
3.4%
R 38370
 
3.2%
C 26058
 
2.2%
N 25849
 
2.2%
P 23473
 
2.0%
B 20673
 
1.7%
M 19615
 
1.6%
Other values (16) 102097
 
8.5%
Common
ValueCountFrequency (%)
. 795970
26.3%
1 660755
21.8%
_ 398088
13.2%
4 250715
 
8.3%
3 195227
 
6.5%
2 179315
 
5.9%
7 136265
 
4.5%
5 106767
 
3.5%
9 82662
 
2.7%
8 81137
 
2.7%
Other values (2) 137832
 
4.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4218989
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 795970
18.9%
1 660755
15.7%
_ 398088
9.4%
A 305944
 
7.3%
U 305494
 
7.2%
S 286408
 
6.8%
4 250715
 
5.9%
3 195227
 
4.6%
2 179315
 
4.3%
7 136265
 
3.2%
Other values (28) 704808
16.7%

level2Name
Text

Missing 

Distinct 4138
Distinct (%) 1.0%
Missing 186171
Missing (%) 31.9%
Memory size 4.5 MiB

Length

Max length 32
Median length 28
Mean length 8.217365525
Min length 2

Characters and Unicode

Total characters 3270758
Distinct characters 109
Distinct categories 9
Distinct scripts 2
Distinct blocks 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 730
Unique (%) 0.2%

Sample

1st row Kairuku-Hiri
2nd row Buncombe
3rd row Augusta
4th row Elko
5th row Itatiaia
ValueCountFrequency (%)
swain 9937
 
2.0%
giles 7933
 
1.6%
frederick 7093
 
1.5%
macon 6483
 
1.3%
madison 6402
 
1.3%
de 6373
 
1.3%
haywood 5697
 
1.2%
la 5624
 
1.1%
san 5466
 
1.1%
prince 5405
 
1.1%
Other values (4402) 422665
86.4%

Most occurring characters

ValueCountFrequency (%)
a 366588
 
11.2%
e 273177
 
8.4%
n 252799
 
7.7%
o 249950
 
7.6%
r 207979
 
6.4%
i 190585
 
5.8%
l 143560
 
4.4%
s 133215
 
4.1%
t 117958
 
3.6%
u 95083
 
2.9%
Other values (99) 1239864
37.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2676771
81.8%
Uppercase Letter 482454
 
14.8%
Space Separator 91048
 
2.8%
Dash Punctuation 10022
 
0.3%
Other Punctuation 8451
 
0.3%
Decimal Number 1843
 
0.1%
Open Punctuation 130
 
< 0.1%
Close Punctuation 21
 
< 0.1%
Math Symbol 18
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 366588
13.7%
e 273177
10.2%
n 252799
 
9.4%
o 249950
 
9.3%
r 207979
 
7.8%
i 190585
 
7.1%
l 143560
 
5.4%
s 133215
 
5.0%
t 117958
 
4.4%
u 95083
 
3.6%
Other values (47) 645877
24.1%
Uppercase Letter
ValueCountFrequency (%)
C 53401
 
11.1%
S 50551
 
10.5%
M 47650
 
9.9%
P 36378
 
7.5%
G 29941
 
6.2%
A 29893
 
6.2%
B 27596
 
5.7%
L 23560
 
4.9%
R 22673
 
4.7%
H 20772
 
4.3%
Other values (23) 140039
29.0%
Decimal Number
ValueCountFrequency (%)
8 903
49.0%
1 416
22.6%
0 167
 
9.1%
7 139
 
7.5%
3 77
 
4.2%
2 64
 
3.5%
5 26
 
1.4%
6 22
 
1.2%
9 21
 
1.1%
4 8
 
0.4%
Other Punctuation
ValueCountFrequency (%)
' 6982
82.6%
. 652
 
7.7%
/ 455
 
5.4%
, 362
 
4.3%
Space Separator
ValueCountFrequency (%)
(space) 91048
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10022
100.0%
Open Punctuation
ValueCountFrequency (%)
( 130
100.0%
Close Punctuation
ValueCountFrequency (%)
) 21
100.0%
Math Symbol
ValueCountFrequency (%)
+ 18
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3159225
96.6%
Common 111533
 
3.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 366588
 
11.6%
e 273177
 
8.6%
n 252799
 
8.0%
o 249950
 
7.9%
r 207979
 
6.6%
i 190585
 
6.0%
l 143560
 
4.5%
s 133215
 
4.2%
t 117958
 
3.7%
u 95083
 
3.0%
Other values (80) 1128331
35.7%
Common
ValueCountFrequency (%)
(space) 91048
81.6%
- 10022
 
9.0%
' 6982
 
6.3%
8 903
 
0.8%
. 652
 
0.6%
/ 455
 
0.4%
1 416
 
0.4%
, 362
 
0.3%
0 167
 
0.1%
7 139
 
0.1%
Other values (9) 387
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3243339
99.2%
None 27250
 
0.8%
Latin Ext Additional 159
 
< 0.1%
IPA Ext 10
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 366588
 
11.3%
e 273177
 
8.4%
n 252799
 
7.8%
o 249950
 
7.7%
r 207979
 
6.4%
i 190585
 
5.9%
l 143560
 
4.4%
s 133215
 
4.1%
t 117958
 
3.6%
u 95083
 
2.9%
Other values (61) 1212445
37.4%
None
ValueCountFrequency (%)
ó 7527
27.6%
á 5558
20.4%
é 5213
19.1%
í 4075
15.0%
ñ 1709
 
6.3%
ã 1117
 
4.1%
ú 495
 
1.8%
ô 244
 
0.9%
â 230
 
0.8%
ō 202
 
0.7%
Other values (21) 880
 
3.2%
Latin Ext Additional
ValueCountFrequency (%)
59
37.1%
59
37.1%
23
 
14.5%
15
 
9.4%
2
 
1.3%
1
 
0.6%
IPA Ext
ValueCountFrequency (%)
ə 10
100.0%

level3Gid
Text

Missing 

Distinct 1518
Distinct (%) 2.9%
Missing 532468
Missing (%) 91.1%
Memory size 4.5 MiB

Length

Max length 14
Median length 13
Mean length 11.70438598
Min length 11

Characters and Unicode

Total characters 605503
Distinct characters 34
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 345
Unique (%) 0.7%

Sample

1st row ECU.18.4.5_1
2nd row BOL.6.5.3_2
3rd row PER.18.3.4_1
4th row ECU.18.4.2_1
5th row PER.1.4.3_1
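
The sample identifiers share a regular shape: a three-letter prefix, dot-separated numbers, and a numeric suffix after an underscore. The sketch below splits an identifier on that shape alone. The field names (`country`, `indices`, `version`) and the helper `parse_gid` are assumptions made for illustration; the report does not define the identifier scheme, which only resembles GADM-style GIDs.

```python
from typing import NamedTuple, Tuple


class Gid(NamedTuple):
    country: str              # three-letter prefix, e.g. "ECU" (assumed meaning)
    indices: Tuple[int, ...]  # one number per administrative level (assumed)
    version: int              # suffix after the underscore (assumed meaning)


def parse_gid(gid: str) -> Gid:
    """Split an identifier such as 'ECU.18.4.5_1' into its parts."""
    body, _, version = gid.rpartition("_")
    country, *parts = body.split(".")
    return Gid(country, tuple(int(p) for p in parts), int(version))


parsed = parse_gid("ECU.18.4.5_1")
print(parsed)               # Gid(country='ECU', indices=(18, 4, 5), version=1)
print(len(parsed.indices))  # 3, matching the level3 column
```

Under this reading, truncating `indices` gives the parent identifier at a shallower level, which may be useful when a level3 value is present but the level2 column is missing.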
ValueCountFrequency (%)
per.1.4.3_1 3333
 
6.4%
per.18.3.4_1 1833
 
3.5%
per.1.4.1_1 1584
 
3.1%
per.8.9.1_1 1099
 
2.1%
per.18.1.1_1 862
 
1.7%
cri.3.3.4_1 850
 
1.6%
pan.3.3.1_1 816
 
1.6%
per.8.11.5_1 790
 
1.5%
ecu.21.2.7_1 708
 
1.4%
mdg.6.2.3_1 683
 
1.3%
Other values (1508) 39175
75.7%

Most occurring characters

ValueCountFrequency (%)
. 155199
25.6%
1 109811
18.1%
_ 51733
 
8.5%
E 29941
 
4.9%
2 26345
 
4.4%
3 23876
 
3.9%
4 23543
 
3.9%
R 20134
 
3.3%
C 19996
 
3.3%
U 14987
 
2.5%
Other values (24) 129938
21.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 243380
40.2%
Other Punctuation 155199
25.6%
Uppercase Letter 155191
25.6%
Connector Punctuation 51733
 
8.5%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 29941
19.3%
R 20134
13.0%
C 19996
12.9%
U 14987
9.7%
P 14622
9.4%
M 9107
 
5.9%
I 8533
 
5.5%
T 6123
 
3.9%
H 6010
 
3.9%
A 5737
 
3.7%
Other values (12) 20001
12.9%
Decimal Number
ValueCountFrequency (%)
1 109811
45.1%
2 26345
 
10.8%
3 23876
 
9.8%
4 23543
 
9.7%
8 14100
 
5.8%
5 13401
 
5.5%
6 10712
 
4.4%
7 9554
 
3.9%
9 6983
 
2.9%
0 5055
 
2.1%
Other Punctuation
ValueCountFrequency (%)
. 155199
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 51733
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 450312
74.4%
Latin 155191
 
25.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 29941
19.3%
R 20134
13.0%
C 19996
12.9%
U 14987
9.7%
P 14622
9.4%
M 9107
 
5.9%
I 8533
 
5.5%
T 6123
 
3.9%
H 6010
 
3.9%
A 5737
 
3.7%
Other values (12) 20001
12.9%
Common
ValueCountFrequency (%)
. 155199
34.5%
1 109811
24.4%
_ 51733
 
11.5%
2 26345
 
5.9%
3 23876
 
5.3%
4 23543
 
5.2%
8 14100
 
3.1%
5 13401
 
3.0%
6 10712
 
2.4%
7 9554
 
2.1%
Other values (2) 12038
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 605503
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 155199
25.6%
1 109811
18.1%
_ 51733
 
8.5%
E 29941
 
4.9%
2 26345
 
4.4%
3 23876
 
3.9%
4 23543
 
3.9%
R 20134
 
3.3%
C 19996
 
3.3%
U 14987
 
2.5%
Other values (24) 129938
21.5%

level3Name
Text

Missing 

Distinct 1463
Distinct (%) 2.8%
Missing 532843
Missing (%) 91.2%
Memory size 4.5 MiB

Length

Max length 32
Median length 28
Mean length 10.63014525
Min length 3

Characters and Unicode

Total characters 545943
Distinct characters 96
Distinct categories 8
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 325
Unique (%) 0.6%

Sample

1st row Montalvo (Andoas)
2nd row Cobija
3rd row Tambopata
4th row Diez De Agosto
5th row Rio Santiago
ValueCountFrequency (%)
de 3839
 
4.4%
rio 3728
 
4.3%
santiago 3466
 
4.0%
el 3141
 
3.6%
san 1843
 
2.1%
tambopata 1833
 
2.1%
cenepa 1584
 
1.8%
santa 1305
 
1.5%
cab 1203
 
1.4%
en 1101
 
1.3%
Other values (1721) 64179
73.6%

Most occurring characters

ValueCountFrequency (%)
a 83102
15.2%
o 46946
 
8.6%
(space) 35864
 
6.6%
n 35831
 
6.6%
i 30929
 
5.7%
e 29861
 
5.5%
r 26273
 
4.8%
t 20664
 
3.8%
l 19562
 
3.6%
u 17432
 
3.2%
Other values (86) 199479
36.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 410361
75.2%
Uppercase Letter 85504
 
15.7%
Space Separator 35864
 
6.6%
Open Punctuation 3997
 
0.7%
Close Punctuation 3090
 
0.6%
Other Punctuation 2866
 
0.5%
Decimal Number 2401
 
0.4%
Dash Punctuation 1860
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 83102
20.3%
o 46946
11.4%
n 35831
8.7%
i 30929
 
7.5%
e 29861
 
7.3%
r 26273
 
6.4%
t 20664
 
5.0%
l 19562
 
4.8%
u 17432
 
4.2%
s 13867
 
3.4%
Other values (40) 85894
20.9%
Uppercase Letter
ValueCountFrequency (%)
S 11425
13.4%
C 8284
 
9.7%
T 7232
 
8.5%
P 6443
 
7.5%
R 5549
 
6.5%
E 5542
 
6.5%
A 5386
 
6.3%
D 5367
 
6.3%
M 4809
 
5.6%
B 4384
 
5.1%
Other values (18) 21083
24.7%
Decimal Number
ValueCountFrequency (%)
1 666
27.7%
3 451
18.8%
0 328
13.7%
2 261
 
10.9%
4 212
 
8.8%
9 172
 
7.2%
6 142
 
5.9%
5 127
 
5.3%
7 24
 
1.0%
8 18
 
0.7%
Other Punctuation
ValueCountFrequency (%)
. 2573
89.8%
' 182
 
6.4%
, 81
 
2.8%
/ 30
 
1.0%
Space Separator
ValueCountFrequency (%)
(space) 35864
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3997
100.0%
Close Punctuation
ValueCountFrequency (%)
) 3090
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1860
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 495865
90.8%
Common 50078
 
9.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 83102
16.8%
o 46946
 
9.5%
n 35831
 
7.2%
i 30929
 
6.2%
e 29861
 
6.0%
r 26273
 
5.3%
t 20664
 
4.2%
l 19562
 
3.9%
u 17432
 
3.5%
s 13867
 
2.8%
Other values (68) 171398
34.6%
Common
ValueCountFrequency (%)
(space) 35864
71.6%
( 3997
 
8.0%
) 3090
 
6.2%
. 2573
 
5.1%
- 1860
 
3.7%
1 666
 
1.3%
3 451
 
0.9%
0 328
 
0.7%
2 261
 
0.5%
4 212
 
0.4%
Other values (8) 776
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 542171
99.3%
None 3721
 
0.7%
Latin Ext Additional 51
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 83102
15.3%
o 46946
 
8.7%
(space) 35864
 
6.6%
n 35831
 
6.6%
i 30929
 
5.7%
e 29861
 
5.5%
r 26273
 
4.8%
t 20664
 
3.8%
l 19562
 
3.6%
u 17432
 
3.2%
Other values (60) 195707
36.1%
None
ValueCountFrequency (%)
ñ 1397
37.5%
é 875
23.5%
à 366
 
9.8%
á 232
 
6.2%
í 150
 
4.0%
ï 133
 
3.6%
ã 95
 
2.6%
â 91
 
2.4%
ó 89
 
2.4%
è 71
 
1.9%
Other values (11) 222
 
6.0%
Latin Ext Additional
ValueCountFrequency (%)
30
58.8%
12
 
23.5%
5
 
9.8%
2
 
3.9%
2
 
3.9%

iucnRedListCategory
Text

Missing 

Distinct 9
Distinct (%) < 0.1%
Missing 23468
Missing (%) 4.0%
Memory size 4.5 MiB

Length

Max length 2
Median length 2
Mean length 2
Min length 2

Characters and Unicode

Total characters 1121466
Distinct characters 11
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row LC
2nd row LC
3rd row LC
4th row LC
5th row LC
ValueCountFrequency (%)
lc 459843
82.0%
ne 34497
 
6.2%
nt 23131
 
4.1%
vu 21629
 
3.9%
en 10407
 
1.9%
cr 6915
 
1.2%
dd 4133
 
0.7%
ex 177
 
< 0.1%
ew 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
C 466758
41.6%
L 459843
41.0%
N 68035
 
6.1%
E 45082
 
4.0%
T 23131
 
2.1%
V 21629
 
1.9%
U 21629
 
1.9%
D 8266
 
0.7%
R 6915
 
0.6%
X 177
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1121466
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 466758
41.6%
L 459843
41.0%
N 68035
 
6.1%
E 45082
 
4.0%
T 23131
 
2.1%
V 21629
 
1.9%
U 21629
 
1.9%
D 8266
 
0.7%
R 6915
 
0.6%
X 177
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 1121466
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 466758
41.6%
L 459843
41.0%
N 68035
 
6.1%
E 45082
 
4.0%
T 23131
 
2.1%
V 21629
 
1.9%
U 21629
 
1.9%
D 8266
 
0.7%
R 6915
 
0.6%
X 177
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1121466
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 466758
41.6%
L 459843
41.0%
N 68035
 
6.1%
E 45082
 
4.0%
T 23131
 
2.1%
V 21629
 
1.9%
U 21629
 
1.9%
D 8266
 
0.7%
R 6915
 
0.6%
X 177
 
< 0.1%